FUNCTIONAL ANALYSIS 


PETER D. LAX 

Courant Institute 

New York University


WILEY-INTERSCIENCE


A JOHN WILEY & SONS, INC., PUBLICATION 




This book is printed on acid-free paper. 
Copyright © 2002 by John Wiley & Sons, Inc. All rights reserved.
Published simultaneously in Canada. 


No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form
or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as
permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior
written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to
the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978)
750-4744. Requests to the Publisher for permission should be addressed to the Permissions Department,
John Wiley & Sons, Inc., 605 Third Avenue, New York, NY 10158-0012, (212) 850-6011, fax (212)
850-6008. E-Mail: PERMREQ@WILEY.COM.


For ordering and customer service, call 1-800-CALL-WILEY.


Library of Congress Cataloging-in-Publication Data 
Lax, Peter D. 
Functional analysis / Peter D. Lax. 
p. cm.
Includes bibliographical references and index.
ISBN 0-471-55604-1 (cloth : alk. paper)
1. Functional analysis. I. Title. 


QA320.L345 2002 
515—dc21 2001046547 


Printed in the United States of America 


10 9 8 7 6 5 4 3 2


CONTENTS 


Foreword xvii 


1. Linear Spaces 1


Axioms for linear spaces—Infinite-dimensional examples— 
Subspace, linear span—Quotient space—Isomorphism—Convex 
sets—Extreme subsets 


2. Linear Maps 8


2.1 Algebra of linear maps, 8 


Axioms for linear maps—Sums and composites—Invertible 
linear maps—Nullspace and range—Invariant subspaces 


2.2 Index of a linear map, 12


Degenerate maps—Pseudoinverse—Index—Product formula for 
the index—Stability of the index 


3. The Hahn-Banach Theorem 19


3.1 The extension theorem, 19 


Positive homogeneous, subadditive functionals—Extension of 
linear functionals—Gauge functions of convex sets 


3.2 Geometric Hahn-Banach theorem, 21 


The hyperplane separation theorem 
3.3 Extensions of the Hahn-Banach theorem, 24 


The Agnew-Morse theorem—The 
Bohnenblust-Sobczyk-Soukhomlinov theorem 


4. Applications of the Hahn-Banach Theorem 29


4.1 Extension of positive linear functionals, 29 
4.2 Banach limits, 31 


4.3 Finitely additive invariant set functions, 33
Historical note, 34


5. Normed Linear Spaces 36


5.1 Norms, 36


Norms for quotient spaces—Complete normed linear spaces—
The spaces C, B—L^p spaces and Hölder’s inequality—Sobolev
spaces, embedding theorems—Separable spaces


5.2 Noncompactness of the unit ball, 43
Uniform convexity—The Mazur-Ulam theorem on isometries
5.3 Isometries, 47


6. Hilbert Space 52


6.1 Scalar product, 52
Schwarz inequality—Parallelogram identity—Completeness,
closure—ℓ², L²

6.2 Closest point in a closed convex subset, 54
Orthogonal complement of a subspace—Orthogonal
decomposition

6.3 Linear functionals, 56
The Riesz-Frechet representation theorem—Lax-Milgram lemma


6.4 Linear span, 58


Orthogonal projection—Orthonormal bases, Gram-Schmidt
process—Isometries of a Hilbert space


7. Applications of Hilbert Space Results 63 
7.1 Radon-Nikodym theorem, 63 
7.2 Dirichlet’s problem, 65 
Use of the Riesz-Frechet theorem—Use of the Lax-Milgram
theorem—Use of orthogonal decomposition


8. Duals of Normed Linear Spaces 72


8.1 Bounded linear functionals, 72


Dual space
8.2 Extension of bounded linear functionals


linear span of a set




8.3 Reflexive spaces, 78 


Reflexivity of L^p, 1 < p < ∞—Separable spaces—Separability
of the dual—Dual of C(Q), Q compact—Reflexivity of 
subspaces 


8.4 Support function of a set, 83 


Dual characterization of convex hull—Dual characterization of 
distance from a closed, convex set 


9. Applications of Duality 87


9.1 Completeness of weighted powers, 87
9.2 The Müntz approximation theorem, 88 
9.3 Runge’s theorem, 91 
9.4 Dual variational problems in function theory, 91 
9.5 Existence of Green’s function, 94 


10. Weak Convergence 99


10.1 Uniform boundedness of weakly convergent sequences, 101 


Principle of uniform boundedness—Weakly sequentially 
closed convex sets 


10.2 Weak sequential compactness, 104 
Compactness of unit ball in reflexive space 
10.3 Weak* convergence, 105 
Helly’s theorem 


11. Applications of Weak Convergence 108


11.1 Approximation of the δ function by continuous functions, 108
Toeplitz’s theorem on summability
11.2 Divergence of Fourier series, 109
11.3 Approximate quadrature, 110
11.4 Weak and strong analyticity of vector-valued functions, 111
11.5 Existence of ...
Galerkin’s method


11.6 The representation of analytic functions with positive real part, 115
Herglotz-Riesz theorem 


12. The Weak and Weak* Topologies 118


Comparison with weak sequential topology—Closed convex sets
in the weak topology—Weak compactness—Alaoglu’s theorem


13. Locally Convex Topologies and the Krein-Milman Theorem 122


13.1 Separation of points by linear functionals, 123
13.2 The Krein-Milman theorem, 124
13.3 The Stone-Weierstrass theorem, 126
13.4 Choquet’s theorem, 128


14. Examples of Convex Sets and Their Extreme Points


14.1 Positive functionals, 133
14.2 Convex functions, 135
14.3 Completely monotone functions, 137
14.4 Theorems of Carathéodory and Bochner, 141
14.5 A theorem of Krein, 147
14.6 Positive harmonic functions, 148
14.7 The Hamburger moment problem, 150 
14.8 G. Birkhoff’s conjecture, 151 
14.9 De Finetti’s theorem, 156 
14.10 Measure-preserving mappings, 157 
Historical note, 159 


15. Bounded Linear Maps 160


15.1 Boundedness and continuity, 160
Norm of a bounded linear map—Transpose
15.2 Strong and weak topologies, 165
Strong and weak sequential convergence
15.3 Principle of uniform boundedness, 166
15.4 Composition of bounded maps, 167
15.5 The open mapping principle, 168
Closed graph theorem
Historical note, 172


16. Examples of Bounded Linear Maps 173


16.1 Boundedness of integral operators, 173
Integral operators of Hilbert-Schmidt type—Integral operators of
Holmgren type
16.2 The convexity theorem of Marcel Riesz, 177
16.3 Examples of bounded integral operators, 180
The Fourier transform, Parseval’s theorem and Hausdorff-Young
inequality—The Hilbert transform—The Laplace transform—
The Hilbert-Hankel transform


16.4 Solution operators for hyperbolic equations, 186 
16.5 Solution operator for the heat equation, 188 


16.6 Singular integral operators, pseudodifferential operators and
Fourier integral operators, 190


17. Banach Algebras and their Elementary Spectral Theory 192
17.1 Normed algebras, 192
Invertible elements—Resolvent set and spectrum—Resolvent—
Spectral radius
17.2 Functional calculus, 197 
Spectral mapping theorem—Projections 


18. Gelfand’s Theory of Commutative Banach Algebras 


Homomorphisms into C—Maximal ideals—Mazur’s lemma—
The spectrum as the range of homomorphisms—The spectral
mapping theorem revisited—The Gelfand representation—
Gelfand topology


19. Applications of Gelfand’s Theory of Commutative Banach Algebras 210 


19.1 The algebra C(S), 210 

19.2 Gelfand compactification, 210 

19.3 Absolutely convergent Fourier series, 212 

19.4 Analytic functions in the closed unit disk, 213 
Analytic functions in the closed polydisk 

19.5 Analytic functions in the open unit disk, 214 

19.6 Wiener’s Tauberian theorem, 215 

19.7 Commutative B*-algebras, 221

Historical note, 224 


20. Examples of Operators and Their Spectra 226


20.1 Invertible maps, 226
Boundary points of the spectrum
20.2 Shifts, 229
20.4 The Fourier transform, 231


21. Compact Maps 233


21.1 Basic properties of compact maps, 233
Compact maps form a two-sided ideal—Identity plus
compact map has index zero
21.2 The spectral theory of compact maps, 238
The transpose of a compact operator is compact—The Fredholm
alternative 

Historical note, 244 


22. Examples of Compact Operators 245


22.1 Compactness criteria, 245
Arzela-Ascoli compactness criterion—Rellich compactness
criterion
22.2 Integral operators, 246
Hilbert-Schmidt operators
22.3 The inverse of elliptic partial differential operators, 249
22.4 Operators defined by parabolic equations, 250
22.5 Almost orthogonal bases, 251


23. Positive compact operators 253


23.1 The spectrum of compact positive operators, 253
23.2 Stochastic integral operators, 256
Invariant probability density
23.3 Inverse of a second order elliptic operator, 258


24. Fredholm’s Theory of Integral Equations 260
24.1 The Fredholm determinant and the Fredholm resolvent, 260
The spectrum of Fredholm operators—A trace formula for
Fredholm operators
24.2 The multiplicative property of the Fredholm determinant, 268
24.3 The Gelfand-Levitan-Marchenko equation and Dyson’s
formula, 271


25. Invariant Subspaces 275
25.1 Invariant subspaces of compact maps, 275
The von Neumann-Aronszajn-Smith theorem
25.2 Nested invariant subspaces, 277
Ringrose’s theorem—Unicellular operators: the
Brodsky-Donoghue theorem—The Robinson-Bernstein and
Lomonosov theorems—Enflo’s example


26. Harmonic Analysis on a Halfline 284


26.1 The Phragmén-Lindelöf principle for harmonic functions, 284


26.2 An abstract Phragmén-Lindelöf principle, 285
Interior compactness
26.3 Asymptotic expansion, 297
Solutions of elliptic differential equations in a half-cylinder


27. Index Theory 300


27.1 The Noether index, 301
Pseudoinverse—Stability of index—Product formula—
Hörmander’s stability theorem
Historical note, 305
27.2 Toeplitz operators, 305
Index-winding number—The inversion of Toeplitz operators—
Discontinuous symbols—Matrix Toeplitz operators
27.3 Hankel operators, 312


28. Compact Symmetric Operators in Hilbert Space 315


Variational principle for eigenvalues—Completeness of
eigenfunctions—The variational principles of Fisher and
Courant—Functional calculus—Spectral theory of compact
normal operators—Unitary operators


29. Examples of Compact Symmetric Operators 323


29.1 Convolution, 323
29.2 The inverse of a differential operator, 326
29.3 The inverse of partial differential operators, 327


30. Trace Class and Trace Formula 329


30.1 Polar decomposition and singular values, 329
30.2 Trace class, trace norm, and trace, 330
30.3 The trace formula, 334
Weyl’s inequalities—Lidskii’s theorem
30.4 The determinant, 341
30.5 Examples and counterexamples of trace class operators, 342
Mercer’s theorem—The trace of integral operators—A Volterra
integral operator—The trace of the powers of an operator


30.6 The Poisson summation formula, 348
Convolution on S¹ and the convergence of Fourier series—The
Selberg trace formula
30.7 How to express the index of an operator as a difference of
traces, 349
30.8 The Hilbert-Schmidt class, 352
Relation of Hilbert-Schmidt class and trace class
30.9 Determinant and trace for operator in Banach spaces, 353
31. Spectral Theory of Symmetric, Normal, and Unitary Operators 354 
31.1 The spectrum of symmetric operators, 356 
Reality of spectrum—Upper and lower bounds for the
spectrum—Spectral radius
31.2 Functional calculus for symmetric operators, 358 
The square root of a positive operator—Polar decomposition of 
bounded operators 
31.3 Spectral resolution of symmetric operators, 361 
Projection-valued measures 
31.4 Absolutely continuous, singular, and point spectra, 364 
31.5 The spectral representation of symmetric operators, 364 
Spectral multiplicity—Unitary equivalence 
31.6 Spectral resolution of normal operators, 370 
Functional calculus—Commutative B*-algebras 
31.7 Spectral resolution of unitary operators, 372 
Historical note, 375 
32. Spectral Theory of Self-Adjoint Operators 377 


The Hellinger-Toeplitz theorem—Definition of
self-adjointness—Domain


32.1 Spectral resolution, 378
Sharpening of Herglotz’s theorem—Cauchy transform of
measures—The spectrum of a self-adjoint operator—
Representation of the resolvent as a Cauchy transform—
Projection-valued measures
32.2 Spectral resolution using the Cayley transform, 389
32.3 A functional calculus for self-adjoint operators, 390


33. Examples of Self-Adjoint Operators 394


33.1 The extension of unbounded symmetric operators, 394
Closure of a symmetric operator


33.2 Examples of the extension of symmetric operators; deficiency
indices, 397
The operator i(d/dx) on C_0^1(R), C_0^1(R_+), and C_0^1(0, 1)—
Deficiency indices and von Neumann’s theorem—Symmetric
operators in a real Hilbert space


33.3 The Friedrichs extension, 402
Semibounded symmetric operators—Symmetric ODE—
Symmetric elliptic PDE


33.4 The Rellich perturbation theorem, 406
Self-adjointness of Schrödinger operators with singular
potentials


33.5 The moment problem, 410
The Hamburger and Stieltjes moment problems—Uniqueness, or
not, of the moment problem


Historical note, 414


34. Semigroups of Operators 416


34.1 Strongly continuous one-parameter semigroups, 418
Infinitesimal generator—Resolvent—Laplace transform
34.2 The generation of semigroups, 424
The Hille-Yosida theorem
34.3 The approximation of semigroups, 427
The Lax equivalence theorem—Trotter’s product formula—
Strang’s product formula
34.4 Perturbation of semigroups, 432
Lumer-Phillips theorem—Trotter’s perturbation theorem
34.5 The spectral theory of semigroups, 434
Phillips’ spectral mapping theorem—Adjoint semigroups—
Semigroups of eventually compact operators


35. Groups of Unitary Operators 440


35.1 Stone’s theorem, 440
Generation of unitary groups—Positive definiteness and
Bochner’s theorem
35.2 Ergodic theory, 443
von Neumann’s mean ergodic theorem


35.3 The Koopman group, 445 


Volume-preserving flows—Metric transitivity—Time 
average—Space average 


35.4 The wave equation, 447 
In full space-time—In the exterior of an obstacle
35.5 Translation representation, 448
Sinai’s theorem—Incoming subspaces—Solution of wave
equation in odd number of space dimensions—Wave propagation
outside an obstacle
35.6 The Heisenberg commutation relation, 455 


The uncertainty principle—Weyl’s form of the commutation 
relation—von Neumann’s theorem on pairs of operators that
satisfy the commutation relation 


Historical note, 459 


36. Examples of Strongly Continuous Semigroups 461 


36.1 Semigroups defined by parabolic equations, 461 

36.2 Semigroups defined by elliptic equations, 462 

36.3 Exponential decay of semigroups, 465 

36.4 The Lax-Phillips semigroup, 470 

36.5 The wave equation in the exterior of an obstacle, 472 


37. Scattering Theory 477 


37.1 Perturbation theory, 477 

37.2 The wave operators, 480 

37.3 Existence of the wave operators, 482 

37.4 The invariance of wave operators, 490 

37.5 Potential scattering, 490 

37.6 The scattering operator, 491 

Historical note, 492 

37.7 The Lax-Phillips scattering theory, 493 

37.8 The zeros of the scattering matrix, 499 

37.9 The automorphic wave equation, 500 
Faddeev and Pavlov’s theory—The Riemann hypothesis


38. A Theorem of Beurling 513 
38.1 The Hardy space, 513
38.2 Beurling’s theorem, 515 


Inner and outer factors—Factorization in the algebra of bounded 
analytic functions 


38.3 The Titchmarsh convolution theorem, 523
Historical note, 525


Texts


A. Riesz-Kakutani representation theorem 529


A.1 Positive linear functionals, 529
A.2 Volume, 532
A.3 L as a space of functions, 535
A.4 Measurable sets and measure, 538
A.5 The Lebesgue measure and integral, 541


B. Theory of distributions


B.1 Definitions and examples, 543
B.2 Operations on distributions, 544
B.3 Local properties of distributions, 547
B.4 Applications to partial differential equations, 554
B.5 The Fourier transform, 558
B.6 Applications of the Fourier transform, 568
B.7 Fourier series, 569


C. Zorn’s Lemma
Author Index 573
Subject Index 577


FOREWORD 


This book grew out of a course of lectures on functional analysis taught over many 
years to second-year graduate students at the Courant Institute of New York Uni- 
versity. It is a graduate text, not a treatise or a monograph. Most of the chapters are 
short, for it is easier to digest material in small chunks. Not all topics can be pre- 
sented briefly, so some of the chapters are longer. Theorems and lemmas, as well as 
equations, are numbered consecutively in each chapter. 

The first 23 chapters make only a modest technical demand on the reader; this 
material would serve very well as text for an introductory graduate course on func- 
tional analysis. The rest of the material is well suited as text for a more advanced 
graduate course on functional analysis in general, or on Hilbert space in particular. 

When I was a student, the only text on functional analysis was Banach’s original 
classic, written in 1932; Hille’s book appeared in time to serve as my graduation 
present. For Hilbert space there was Stone’s Colloquium publication, also from 1932, 
and Sz.-Nagy’s Ergebnisse volume. Since then, our cup hath run over; first came 
Riesz and Sz.-Nagy, then Dunford and Schwartz, Yosida, later Reed and Simon, and 
Rudin. For Hilbert space, there was Halmos’s elegant slender volume, and Achiezer 
and Glazman, all of which I read with pleasure and profit. Many, many more good 
texts have appeared since. Yet I believe that my book offers something new: the 
order in which the material is arranged, the interspersing of chapters on theory with 
chapters on applications, so that cold abstractions are made flesh and blood, and 
the inclusion of a very rich fare of mathematical problems that can be clarified and 
solved from the functional analytic point of view. 

In choosing topics I heeded the warning of my teacher Friedrichs: “It is easy to
write a book if you are willing to put into it everything you know about the
subject.” I present the basic structure of the subject, and those more advanced
topics that loom large in the body of mathematics. Among these are the spectral
resolution and spectral representation of self-adjoint operators, the theory of
compact operators, the Krein-Milman theorem, Gelfand’s theory of commutative Banach
algebras, invariant subspaces, strongly continuous one-parameter semigroups. I
discuss the index of operators, so important in calculating topological invariants;
the celebrated trace formula of Lidskii, a powerful tool in analysis; the Fredholm
determinant and its generalizations, rising again after almost a hundred years of
hibernation; and scattering theory, another gift from physics to mathematics. I have
also included some (but not all) special topics close to my heart.

What has been omitted? All of nonlinear functional analysis, for which I recommend
the four-volume treatise by Zeidler. Operator algebras, except for Gelfand’s
theory of commutative Banach algebras. I slight the geometric theory of Banach
spaces; happily a handbook on this subject, edited by Bill Johnson and Joram
Lindenstrauss, is about to be published by North Holland.

What are the prerequisites? What every second-year graduate student—and many
undergraduates—knows:


• Naive set theory. Denumerable sets, the continuum, Zorn’s lemma.


• Linear algebra. The alternative for linear maps, trace and determinant of a
matrix, the spectral theory of general and symmetric matrices, functions of a
matrix.


• Point set topology. Complete metric spaces, the Baire category principle,
Hausdorff spaces, compact sets, Tychonov’s theorem.


• Basic theory of functions of a complex variable.


• Real variables. The Arzela-Ascoli theorem, the Lebesgue decomposition of
measures on R, Borel measure on compact spaces.


It is an accident of history that measure theory was invented before functional 
analysis. The usual presentations of measure theory fail to take advantage of the con- 
cepts and constructions of functional analysis. In an appendix on the Riesz-Kakutani 
representation theorem I show how to use the tools of functional analysis in measure 
theory. Another appendix summarizes the basic facts of Laurent Schwartz’s theory 
of distributions.

Many of the applications are to problems of partial differential equations. Here a
nodding acquaintance with the Laplace and the wave equation would help, although
an alert uninformed reader could pick up some of the basic results from these pages.


Like most mathematicians, I am no historian. Yet I have included historical re- 
marks in some of the chapters, mainly where I had some firsthand knowledge, or 
where conventional history has been blatantly silent concerning the tragic fate of 
many of the founding fathers of functional analysis during the European horrors of 
the 1930s and 1940s. 

I am indebted to many. I learned the rudiments of functional analysis, and how 
to apply them, from my teacher Friedrichs. Subsequently my views were shaped by 
the work of Tosio Kato, who has brought the power of functional analysis to bear 
on an astonishing range of problems. My happy and long collaboration with Ralph 
Phillips has led to some unusual uses of functional analysis. I learned much from 
Israel Gohberg, especially about the index of Toeplitz operators, from Bill Johnson 
about the fine points of the geometry of Banach spaces, and from Bob Phelps about 
Choquet’s theorem. I thank Reuben Hersh and Louise Raphael for their critique of 
the appendix on distributions, and Jerry Goldstein for his expert comments on the
material on semigroups and scattering theory. To all of them, as well as to Gabor
Francsics, my thanks.

Jerry Berkowitz and I alternated teaching functional analysis at the Courant In- 
stitute. This would be a better book had he lived and looked the manuscript over 
critically. 

I thank Jeff Rosenbluth and Paul Chernoff for a careful reading of the early
chapters, and Keisha Grady for TeXing the manuscript, and cheerfully making subsequent
changes and corrections.

The lecture course was popular and successful with graduate students of the 
Courant Institute. I hope this printed version retains the spirit of the lectures. 


PETER D. LAX 


New York, NY 
November 2001 


1


LINEAR SPACES


A linear space X over a field F is a mathematical object in which two operations 
are defined: addition and multiplication by scalars. 
Addition, denoted by +, as in


x + y (1)
is assumed to be commutative,
x + y = y + x, (2)
associative,
x + (y + z) = (x + y) + z, (3)


and to form a group, with the neutral element denoted as 0:
x + 0 = x. (4)
The inverse of addition is denoted by −:


x + (−x) ≡ x − x = 0. (5)


The second operation is the multiplication of elements of X by elements k of the
field F: 


kx. 


The result of this multiplication is again an element of X. Multiplication by elements
of F is assumed to be associative,


k(ax) = (ka)x, (6)


and distributive,


k(x + y) = kx + ky (7)


as well as 
(a+b)x = ax + bx. (8) 


We assume that multiplication by the unit of F, denoted as 1, acts as the identity: 


1x = x. (9)


These are the axioms of linear algebra. From them we proceed to draw some
deductions.
Set b = 0 in (8). It follows that for all x,


0x = 0. (10)
Set a = 1, b = −1 in (8). Using (9) and (10), we deduce that for all x,

(−1)x = −x. (11)
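
The two deductions can be written out step by step. What follows is a sketch of the standard argument; the one step not stated in the text is the cancellation of ax, which uses the group axioms (2) through (5).

```latex
% Deduction of (10): set b = 0 in (8), using a + 0 = a in the field F.
\begin{align*}
ax &= (a + 0)x = ax + 0x && \text{by (8)} \\
0 &= ax + (-(ax)) = \bigl(ax + 0x\bigr) + (-(ax)) = 0x && \text{by (2)--(5)}
\end{align*}
% Deduction of (11): set a = 1, b = -1 in (8).
\begin{align*}
0 = 0x = \bigl(1 + (-1)\bigr)x &= 1x + (-1)x = x + (-1)x && \text{by (10), (8), (9)}
\end{align*}
% Adding -x to both sides gives (-1)x = -x.
```

Note that in (10) the symbol 0 on the left is the zero of F, while the 0 on the right is the neutral element of X.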

The finite-dimensional linear spaces are dealt with in courses on linear algebra. In
this book the emphasis is on the infinite-dimensional ones—those that are not
finite-dimensional. The field F will be either the real numbers R or the complex
numbers C. Here are some examples.


Example 1. X is the space of all polynomials in a single variable s, with real
coefficients; here F = R.


Example 2. X is the space of all polynomials in N variables s_1, ..., s_N, with real
coefficients; here F = R.


Example 3. G is a domain in the complex plane, and X the space of all functions
complex analytic in G; here F = C.


Example 4. X = space of all vectors
x = (a_1, a_2, ...)
with infinitely many real components; here F = R.


Example 5. Q is a Hausdorff space, X the space of all continuous real-valued func- 
tions on Q, here F = R. 


Example 6. M is a C^∞ differentiable manifold, X = C^∞(M), the space of all
differentiable functions on M.


Example 7. Q is a measure space with measure m, X = L^1(Q, m).


Example 8. X = L^p(Q, m).
Example 9. X = harmonic functions in the upper half-plane.


Example 10. X = all solutions of a linear partial differential equation in a given
domain.


Example 11. All meromorphic functions on a given Riemann surface; F = C. 


We start the development of the theory by giving the basic constructions and con- 
cepts. Given two subsets S and T of a linear space X, we define their sum, denoted 
as S + T, to be the set of all points x of the form x = y + z, y in S, z in T. The
negative of a set S, denoted as —S, consists of all points x of the form x = —y, y 
in S. 
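
A small worked example may help fix the notation. The choice X = R and the particular two-point sets are ours, for illustration only.

```latex
% Illustrative choice: X = \mathbb{R}, S = \{0, 1\}, T = \{0, 10\}.
\begin{align*}
S + T &= \{0 + 0,\ 0 + 10,\ 1 + 0,\ 1 + 10\} = \{0,\ 1,\ 10,\ 11\}, \\
-S &= \{0,\ -1\}, \\
S + (-S) &= \{0 + 0,\ 0 + (-1),\ 1 + 0,\ 1 + (-1)\} = \{-1,\ 0,\ 1\}.
\end{align*}
```

In particular S + (−S) is in general larger than {0}, so the negative of a set is not an inverse for set addition.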

Given two linear spaces Z and U over the same field, their direct sum is a linear 
space denoted as Z @ U, consisting of ordered pairs (z, u), z in Z, u in U. Addition 
and multiplication by scalars is componentwise. 


Definition. A subset Y of a linear space X is called a linear subspace of X if sums
and scalar multiples of elements of Y belong to Y.


Theorem 1. 


(i) The sets {0} and X are linear subspaces of X. 
(ii) The sum of any collection of subspaces is a subspace. 
(iii) The intersection of any collection of subspaces is a subspace. 


(iv) The union of a collection of subspaces totally ordered by inclusion is a sub- 
space. 


Exercise 1. Prove theorem 1. 


Let S be some subset of the linear space X. Consider the collection {Y_σ} of all
linear subspaces that contain the set S. This collection is not empty, since it certainly
contains X.


Definition. The intersection ∩ Y_σ of all linear subspaces Y_σ containing the set S is
called the linear span of the set S.


Theorem 2. 


(i) The linear span of a set S is the smallest linear subspace containing S. 


(ii) The linear span of S consists of all elements x of the form


x = Σ_{j=1}^{n} a_j x_j,   x_j ∈ S, a_j ∈ F, n any natural number. (12)




Proof. Part (i) is merely a rephrasing of the definition of linear span. To prove
part (ii), we remark that on the one hand, the elements of the form (12) form a linear
subspace of X; on the other hand, every x of form (12) is contained in any subspace
Y containing S.


REMARK 1. An element x of form (12) is called a linear combination of the points
x_1, ..., x_n. So theorem 2 can be restated as follows:

The linear span of a subset S of a linear space consists of all linear combinations
of elements of S.


Definition. X a linear space, Y a linear subspace of X. Two points x_1 and x_2 of X
are called equivalent modulo Y, denoted as x_1 ≡ x_2 (mod Y), if x_1 − x_2 belongs
to Y.


It follows from the properties of addition that equivalence mod Y is an equiva- 
lence relation, meaning that it is symmetric, reflexive, and transitive. That being the 
case, we can divide X into distinct equivalence classes mod Y. We denote the set of 
equivalence classes as X/Y. The set X/Y has a natural linear structure; the sum of 
two equivalence classes is defined by choosing arbitrary points in each equivalence 
class, adding them and forming the equivalence class of the sum. It is easy to check 
that the last equivalence class is independent of the representatives we picked; put
differently, if x_1 ≡ z_1, x_2 ≡ z_2, then x_1 + x_2 ≡ z_1 + z_2 mod Y. Similarly we define
multiplication by a scalar by picking arbitrary elements in the equivalence class. The
resulting operation does not depend on the choice, since, if x_1 ≡ z_1, then kx_1 ≡ kz_1
mod Y. The quotient set X/Y endowed with this natural linear structure is called the
quotient space of X mod Y. We define codim Y = dim X/Y. 
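
One way to check that the operations on X/Y do not depend on the representatives is sketched below; it uses only that Y is closed under sums and scalar multiples. The closing illustration (X = R², Y the horizontal axis) is our own choice, not taken from the text.

```latex
% Suppose x_1 \equiv z_1 and x_2 \equiv z_2 (mod Y), i.e. x_1 - z_1 \in Y and x_2 - z_2 \in Y.
\begin{align*}
(x_1 + x_2) - (z_1 + z_2) &= (x_1 - z_1) + (x_2 - z_2) \in Y, \\
k x_1 - k z_1 &= k\,(x_1 - z_1) \in Y.
\end{align*}
% Illustration (assumed example): X = \mathbb{R}^2, Y = \{(t, 0) : t \in \mathbb{R}\}.
% Then (a, b) \equiv (c, d) \pmod{Y} if and only if b = d, so each equivalence
% class is a horizontal line, the map [(a, b)] \mapsto b is an isomorphism of
% X/Y onto \mathbb{R}, and codim Y = 1.
```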


Exercise 2. Verify the assertions made above. 


As with all algebraic structures, so with linear structures we have the concept of 
isomorphism. 


Definition. Two linear spaces X and Z over the same field are isomorphic if there
is a one-to-one correspondence T carrying one into the other that maps sums into
sums, scalar multiples into scalar multiples; that is,


T(x_1 + x_2) = T(x_1) + T(x_2),
T(kx) = kT(x). (13)


We define similarly homomorphism, called in this context a linear map. 


Definition. X and U are linear spaces over the same field. A mapping M : X → U
is called linear if it carries sums into sums, and scalar multiples into scalar multiples;
that is, if for all x, y in X and all k in F


M(x + y) = M(x) + M(y),
M(kx) = kM(x). (14)


X is called the domain of M, U its target. 


REMARK 2. An isomorphism of linear spaces is a linear map that is one-to-one and 
onto. 


Theorem 3. 


(i) The image of a linear subspace Y of X under a linear map M : X → U is a
linear subspace of U. 


(ii) The inverse image under M of a linear subspace V of U is a linear subspace 
of X. 


Exercise 3. Prove theorem 3. 

A very important concept in a linear space over the reals is convexity:

Definition. X is a linear space over the reals; a subset K of X is called convex if,
whenever x and y belong to K, the whole segment with endpoints x, y, meaning all
points of the form

ax + (1 − a)y, 0 ≤ a ≤ 1, (15)


also belongs to K.


Examples of convex sets in the plane are the circular disk, triangle, and semicir- 
cular disk. The following property of convex sets is an immediate consequence of 
the definition:


Theorem 4. Let K be a convex subset of a linear space X over the reals. Suppose
that x_1, ..., x_n belong to K; then so does every x of the form


x = Σ_{j=1}^{n} a_j x_j,   a_j ≥ 0,   Σ_{j=1}^{n} a_j = 1. (16)


Exercise 4. Prove theorem 4.


An x of form (16) is called a convex combination of x_1, x_2, ..., x_n.
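
As a concrete instance, here is a convex combination of three points in the plane, together with a regrouping that reduces it to two applications of the two-point formula (15). The points and weights are our own illustrative choice.

```latex
% Assumed data: x_1 = (0,0), x_2 = (1,0), x_3 = (0,1);
% weights a_1 = 1/2, a_2 = 1/4, a_3 = 1/4, all \ge 0 and summing to 1.
\begin{align*}
x &= \tfrac12 x_1 + \tfrac14 x_2 + \tfrac14 x_3 = \bigl(\tfrac14,\ \tfrac14\bigr), \\
x &= (1 - a_3)\Bigl(\tfrac{a_1}{1 - a_3}\,x_1 + \tfrac{a_2}{1 - a_3}\,x_2\Bigr) + a_3 x_3
   = \tfrac34\Bigl(\tfrac23 x_1 + \tfrac13 x_2\Bigr) + \tfrac14 x_3 .
\end{align*}
% The inner bracket is the point (1/3, 0) on the segment from x_1 to x_2;
% x then lies on the segment from that point to x_3.
```

The same regrouping, applied with n points in place of three, suggests an argument by induction on n.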




Theorem 5. Let X be a linear space over the reals. 


(i) The empty set is convex. 
(ii) A subset consisting of a single point is convex. 
(iii) Every linear subspace of X is convex.
(iv) The sum of two convex subsets is convex.
(v) If K is convex, so is −K.


(vi) The intersection of an arbitrary collection of convex sets is convex.


(vii) Let {K_j} be a collection of convex subsets that is totally ordered by inclusion.
Then their union ∪ K_j is convex.


(viii) The image of a convex set under a linear map is convex.


(ix) The inverse image of a convex set under a linear map is convex. 


Exercise 5. Prove theorem 5. 


Definition. Let S be any subset of a linear space X over the reals. The convex hull 
of S is defined as the intersection of all convex sets containing S. The hull is denoted 
as S. 


Theorem 6. 


(i) The convex hull of S is the smallest convex set containing S. 
(ii) The convex hull of S consists of all convex combinations (16) of points of S. 


Exercise 6. Prove theorem 6. 
Definition. A subset E of a convex set K is called an extreme subset of K if: 


(i) E is convex and nonempty.
(ii) whenever a point x of E is expressed as

    $x = \frac{y + z}{2}, \quad y, z \text{ in } K,$

then both y and z belong to E.
An extreme subset consisting of a single point is called an extreme point of K. 


Example 1. K is the interval $0 \le x \le 1$; the two endpoints are extreme points.


Example 2. K is the closed disk

    $x^2 + y^2 \le 1.$

Every point on the circle $x^2 + y^2 = 1$ is an extreme point.


Example 3. The open disk

    $x^2 + y^2 < 1$

has no extreme points.


Example 4. K a polyhedron, including faces. Its extreme subsets are its faces, edges, 
vertices, and of course K itself. 


Theorem 7. Let K be a convex set, E an extreme subset of K, and F an extreme 
subset of E. Then F is an extreme subset of K. 


Exercise 7. Prove theorem 7. 

Theorem 8. Let M be a linear map of the linear space X into the linear space U. 
Let K be a convex subset of U, E an extreme subset of K. Then the inverse image of 
E is either empty or an extreme subset of the inverse image of K. 


Exercise 8. Prove theorem 8. 


Exercise 9. Give an example to show that the image of an extreme subset under a 
linear map need not be an extreme subset of the image. 


Taking U to be one dimensional, we get

Corollary 8'. Denote by H a convex subset of a linear space X, $\ell$ a linear map
of X into R, $H_{\min}$ and $H_{\max}$ the subsets of H where $\ell$ achieves its minimum and
maximum, respectively.

Assertion. When nonempty, $H_{\min}$ and $H_{\max}$ are extreme subsets of H.


2.1 ALGEBRA OF LINEAR MAPS 
We recall from chapter 1 that a linear map from one linear space X into another, U,
both over the same field of scalars, is a mapping of X into U,

    $M : X \to U,$

that is an algebraic homomorphism:

    $M(x + y) = M(x) + M(y),$
    $M(kx) = kM(x).$    (1)


In this section we explore those properties of linear maps that depend on the purely 
algebraic properties (1), without any topological restrictions imposed on the spaces 
X,U. 

The sum of two linear maps M and N of X into U, and the scalar multiple of a linear
map, are defined as

    $(M + N)(x) = M(x) + N(x),$    (2)
    $(kM)(x) = kM(x).$    (3)


This makes a linear space out of the set of linear maps of X into U. The space
is denoted as $\mathcal{L}(X, U)$. Given two linear maps, one, M, from $X \to U$, the other, N,
from $U \to W$, we can define their product as the composite map

    $(NM)(x) = N(M(x)).$    (4)


Since composition of maps in general is associative, so is in particular the composition
of linear maps. As we will see, composition is far from being commutative.
From now on we omit the bracket and denote the action of a linear map on x as 


M(x) = Mx. 


This notation suggests that the action of M on x is a kind of multiplication; indeed 
(1) and (2) give the distributive property of this kind of multiplication. 




Exercise 1. Verify that the composite of two linear maps is linear, and that the dis- 
tributive law holds: 


M(N + K) = MN + MK, 
(M+ K)N =MN+KN. 


Definition. A mapping is invertible if it maps X one-to-one and onto U.
If M is invertible, it has an inverse, denoted as $M^{-1}$, that satisfies

    $M^{-1}M = I, \quad MM^{-1} = I,$

where I on the left is the identity mapping on X, on the right on U. If M is linear, so
is $M^{-1}$.


Definition. The nullspace of M, denoted by $N_M$, is the set of points mapped into
zero.

The range of M, denoted by $R_M$, is the image of X under M in U.


Theorem 1. Let M be a linear map of $X \to U$.

(i) The nullspace $N_M$ is a linear subspace of X, the range $R_M$ a linear subspace
of U.
(ii) M is invertible iff $N_M = \{0\}$ and $R_M = U$.
(iii) M maps the quotient space $X/N_M$ one-to-one onto $R_M$.
(iv) If $M : X \to U$ and $K : U \to W$ are both invertible, so is their product, and

    $(KM)^{-1} = M^{-1}K^{-1}.$


(v) If KM is invertible, then $N_M = \{0\}$ and $R_K = W$.


Exercise 2. Prove theorem 1. 


We remark that when X = U = W are finite dimensional, then the invertibility of
the product NM implies that N and M separately are invertible. This is not so in the
infinite-dimensional case; take, for instance, X to be the space of infinite sequences

    $x = (a_1, a_2, \dots)$

and define R and L to be right and left shift: $Rx = (0, a_1, a_2, \dots)$, $Lx = (a_2, a_3, \dots)$.
Clearly, LR is the identity map, but neither R nor L is invertible;
nor is RL the identity.
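
The behaviour of the two shifts can be seen in a short Python sketch (an illustration added here, not from the original text). It assumes a sequence is modeled as a function n -> a_n for n >= 1; the names `right_shift`, `left_shift`, and `head` are hypothetical helpers.

```python
from typing import Callable, List

# A sequence x = (a_1, a_2, ...) is modeled as a function n -> a_n for n >= 1.
Seq = Callable[[int], float]

def right_shift(x: Seq) -> Seq:
    """R x = (0, a_1, a_2, ...)."""
    return lambda n: 0.0 if n == 1 else x(n - 1)

def left_shift(x: Seq) -> Seq:
    """L x = (a_2, a_3, ...)."""
    return lambda n: x(n + 1)

def head(x: Seq, count: int = 5) -> List[float]:
    """First `count` terms of a sequence, for inspection only."""
    return [x(n) for n in range(1, count + 1)]

x: Seq = lambda n: float(n)               # x = (1, 2, 3, ...)
lr = head(left_shift(right_shift(x)))     # terms of L R x
rl = head(right_shift(left_shift(x)))     # terms of R L x
```

On this sample, `lr` reproduces the first terms of x, while `rl` has lost the first entry: RL replaces $a_1$ by 0, so R is not onto and L is not one-to-one.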




We formulate now a number of useful notions and results concerning mappings
of a linear space into itself:

    $M : X \to X.$

We denote by $N_j$ the nullspace of the jth power of M:

    $N_j = N_{M^j}.$    (5)


Theorem 2. The subspaces $N_j$ defined in (5) have these properties:

    $N_j \subset N_{j+1}$ for all j    (6)

and

    $\dim\left(\frac{N_j}{N_{j-1}}\right) \ge \dim\left(\frac{N_{j+1}}{N_j}\right)$ for all j.    (7)


Proof. Equation (6) is an immediate consequence of (5). To show (7), we claim
that M maps $N_{j+1}/N_j$ into $N_j/N_{j-1}$ in a one-to-one fashion. To see this, note that
a nonzero element of $N_{j+1}/N_j$ is represented by a point z in $N_{j+1}$ that does not
lie in $N_j$. Clearly, Mz lies in $N_j$ but not in $N_{j-1}$; this shows the one-to-oneness.
It follows that $N_{j+1}/N_j$ is isomorphic to a subspace of $N_j/N_{j-1}$, from which the
statement (7) about dimension follows. When $N_{j+1}/N_j$ is infinite-dimensional, so
is $N_j/N_{j-1}$. □
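
In finite dimensions, the monotonicity (7) can be observed directly. The sketch below (an added illustration, not from the original text) computes $\dim N_j$ for a small nilpotent matrix using exact rational arithmetic. The helper names `matmul`, `rank`, and `nullity_of_powers`, and the choice of matrix, are assumptions made for this example.

```python
from fractions import Fraction
from typing import List

Matrix = List[List[Fraction]]

def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two matrices with Fraction entries."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def rank(m: Matrix) -> int:
    """Rank by exact Gaussian elimination over the rationals."""
    rows = [row[:] for row in m]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [u - factor * v for u, v in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def nullity_of_powers(m: Matrix, max_power: int) -> List[int]:
    """Return [dim N_0, ..., dim N_max_power], where N_j is the nullspace of M^j."""
    n = len(m)
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # M^0 = I
    dims = []
    for _ in range(max_power + 1):
        dims.append(n - rank(power))   # rank-nullity in finite dimensions
        power = matmul(power, m)
    return dims

# Nilpotent example: Jordan blocks of sizes 3 and 1 (an illustrative choice).
M = [[Fraction(v) for v in row] for row in
     [[0, 1, 0, 0],
      [0, 0, 1, 0],
      [0, 0, 0, 0],
      [0, 0, 0, 0]]]
dims = nullity_of_powers(M, 4)
# steps[j] = dim(N_{j+1} / N_j); theorem 2 says this list is nonincreasing.
steps = [dims[j + 1] - dims[j] for j in range(len(dims) - 1)]
```

The list `steps` is nonincreasing, as (7) requires, and once it reaches 0 it stays 0.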


The following is an immediate corollary of equation (7):

Theorem 2'. Suppose that for some i the subspaces defined by (5) satisfy

    $N_i = N_{i+1};$    (8)

then

    $N_i = N_k$ for all k > i.    (8')


Definition. A subspace Y of X is called an invariant subspace of a linear map
$M : X \to X$ if M maps Y into Y.


Theorem 3. Suppose that Y is an invariant subspace of X for a mapping $M : X \to X$.
Then

(i) there is a natural interpretation of M as a mapping $X/Y \to X/Y$.
(ii) if both maps

    $M : Y \to Y$ and $M : X/Y \to X/Y$

are invertible, so is $M : X \to X$.




Proof. We leave part (i) to the reader. In (ii) we show first that the nullspace of M
on X is trivial. To see this, suppose that

    $Mz = 0;$

then, since the nullspace of M on X/Y is assumed to be trivial, it follows that z
belongs to Y. But since the nullspace of M on Y also is trivial, it follows that z = 0.
Next we show that $M : X \to X$ is onto, meaning that

    $Mx_0 = u_0$    (9)

has a solution $x_0$ for every $u_0$ in X. To this end we solve equation (9) in two stages.
First we solve the congruence

    $Mx \equiv u_0 \pmod{Y},$

which is possible since M maps X/Y onto itself. Let $x_1$ be an element of the solution
class; then $x_1$ satisfies

    $Mx_1 = u_0 + z, \quad z \text{ in } Y.$

Therefore the solution $x_0$ of (9) is

    $x_0 = x_1 - y,$

where y is the solution in Y of

    $My = z.$

Such a solution exists since M is assumed to map Y onto Y.


We remark that whereas invertibility of M on Y and X/Y guarantees the invertibility
of M on X, the converse by no means holds in spaces of infinite dimension.
For example, let X be the space of all bounded continuous functions on R, S the shift
operator

    $(Sx)(t) = x(t - 1),$

and Y the subspace of functions x(t) that vanish on the negative axis. Clearly, Y is
shift invariant, and equally clearly, S is invertible on X, its inverse being the left unit
shift. But S is not invertible on either Y or X/Y; on Y its range consists of functions
x(t) that are zero for t < 1, and on X/Y it has a nontrivial nullspace.


Exercise 3. What is the nullspace of S on X/Y? 


The construction of invariant subspaces will be taken up in chapter 25. Here we 
gather the following useful observations: 




Theorem 4. Let M be a linear map: $X \to X$.

(i) For any y in X, the set $\{p(M)y\}$, where p represents any polynomial, is an
invariant subspace of M.
(ii) Let T be a linear map: $X \to X$ that commutes with M: TM = MT. Then the
nullspace of T is an invariant subspace of M.


Proof. Part (i) rests on the observation that if p(M) is a polynomial, so is Mp(M).

Part (ii) follows from the observation that if M and T commute, and if z is in the
nullspace of T: Tz = 0, then TMz = MTz = M0 = 0. □


2.2 INDEX OF A LINEAR MAP 


The next group of theorems describes an important special class of mappings.


Definition. A linear map G is called degenerate if its range is finite dimensional:

    $\dim R_G < \infty.$    (10)

Theorem 5. The degenerate maps form an ideal in the following sense: 


(i) The sum of two degenerate maps is degenerate. 

(ii) The product of a degenerate map with any linear map, in either order, is de- 
generate; that is, if G is degenerate, so are MG and GN, provided of course 
that the products can be defined. 


Exercise 4. Prove theorem 5. 


Definition. The linear maps $M : X \to U$ and $L : U \to X$ are pseudoinverse to each
other if

    $LM = I + G, \quad ML = I + G,$    (11)

where I denotes the identity, G degenerate maps of $X \to X$ and $U \to U$, respectively.


Exercise 5. Prove that the right shift and the left shift described after theorem 1 are 
pseudoinverses of each other on the space of all sequences. 


Theorem 6. 


(i) If L and M are pseudoinverses of each other, so are $L + G_1$ and $M + G_2$,
where $G_1$, $G_2$ are arbitrary degenerate maps.

(ii) Suppose that $M : X \to U$ and $A : U \to W$ have pseudoinverses L and B,
respectively. Then AM and LB are pseudoinverse to each other.


Exercise 6. Prove theorem 6. 
We recall the definition of codimension of a subspace R of a linear space U: 
codim R = dim(U/R). 
Theorem 7. A linear map $M : X \to U$ has a pseudoinverse if and only if

    $\dim N_M < \infty, \quad \operatorname{codim} R_M < \infty.$    (12)

Proof. For the “only if” part we use a lemma:


Lemma 8. If G is a degenerate map of $X \to X$, then

    $\dim N_{I+G} < \infty, \quad \operatorname{codim} R_{I+G} < \infty.$    (13)

Proof. For x in $N_{I+G}$,

    $x + Gx = 0.$

This shows that

    $N_{I+G} \subset R_G;$

combined with (10) this shows the first part of (13).
According to theorem 1 (iii), G maps $X/N_G$ one-to-one onto $R_G$; so

    $\operatorname{codim} N_G = \dim R_G.$    (14)

Obviously I + G maps every x in $N_G$ into itself; this shows that $R_{I+G} \supset N_G$. It
follows from this relation that

    $\operatorname{codim} R_{I+G} \le \operatorname{codim} N_G.$    (14')

Combining (14) and (14'), we conclude that $\operatorname{codim} R_{I+G} \le \dim R_G$; using (10), we
deduce the second part of (13). □


Suppose now that M has a pseudoinverse; then (11) holds. From the first relation
in (11) we deduce that $N_M \subset N_{I+G}$ and therefore $\dim N_M \le \dim N_{I+G}$; combining
this with the first part of (13), we obtain the first part of (12). It follows from the
second relation in (11) that $R_M \supset R_{I+G}$. Therefore

    $\operatorname{codim} R_M \le \operatorname{codim} R_{I+G}.$

Combining this with the second relation in (13), we deduce the second part of (12).


For the “if” part we need:


Lemma 9. Every subspace N of a linear space X has a complementary subspace Y,
namely a linear subspace Y of X such that

    $X = N \oplus Y,$

meaning that every x in X can be decomposed uniquely as

    $x = n + y, \quad n \in N, \quad y \in Y.$    (15)


Proof. Consider all subspaces Y of X whose intersection with N is {0}, partially
ordered by inclusion. Every totally ordered collection of $Y_j$ has as upper bound the
union of the $Y_j$. Zorn's lemma shows that there is a maximal Y; this Y clearly has
the property stated in the lemma. Now, if some x cannot be expressed in the form (15),
we could enlarge Y by adjoining x, contradicting the maximality of Y. □


Note that the complementary subspace Y is in no way uniquely determined. Hav- 
ing determined a particular Y, we define the projection P onto N from the decompo- 
sition (15): 

Px =n. 
Exercise 7. Prove that P is a linear map. 
Exercise 8. Show that when N has finite codimension, dim Y = codim N. 

We return now to the proof of the “if” part of theorem 7: it follows from (15) that 
every equivalence class of X mod N contains exactly one element belonging to Y, 
and that this correspondence is an isomorphism: 


    $Y \leftrightarrow X/N.$


Suppose that $M : X \to U$ satisfies conditions (12); we choose complementary
subspaces Y and V for the nullspace and range of M:

    $X = N_M \oplus Y, \quad U = R_M \oplus V.$    (16)

According to theorem 1 (iii), M maps $X/N_M$ one-to-one onto $R_M$. Since $X/N_M$ is
isomorphic with Y, we conclude that

    $M : Y \to R_M$

is invertible. Denote its inverse by $M^{-1}$ and define the map K as follows:

    $K = M^{-1}$ on $R_M$,  $K = 0$ on V.    (17)


Using (16), we can extend K to all of U. Clearly,

    $KM = \begin{cases} I & \text{on } Y \\ 0 & \text{on } N_M \end{cases}, \quad MK = \begin{cases} I & \text{on } R_M \\ 0 & \text{on } V \end{cases}.$    (17')

We can rewrite (17') as follows:

    $KM = I - P, \quad MK = I - Q,$

where P is projection onto $N_M$, Q projection onto V. It follows from this that K and M
are pseudoinverse to each other in the sense of (11). Since P and Q are degenerate,
the proof of theorem 7 is complete. □


Definition. Let M : X — U be a linear map with a pseudoinverse. We define the 
index of such an M as 


    $\operatorname{ind} M = \dim N_M - \operatorname{codim} R_M.$    (18)
It follows from theorem 7 that this definition makes sense. 


Theorem 10. $M : X \to U$ and $L : U \to W$ are linear maps with pseudoinverses.
Then the product LM has a pseudoinverse, and


ind (LM) = ind L + ind M. (19) 


Proof. By theorem 6 (ii), LM has a pseudoinverse. To prove (19), we want to use
as a counting device the notion of an exact sequence:


Definition. A sequence of linear spaces $V_0, V_1, \dots, V_n$ and a sequence of linear
maps $T_j : V_j \to V_{j+1}$,

    $V_0 \xrightarrow{T_0} V_1 \xrightarrow{T_1} \cdots \xrightarrow{T_{n-1}} V_n,$

is called exact if the range of $T_j$ is the nullspace of $T_{j+1}$.


Lemma 11. Suppose that all the $V_j$ in the exact sequence above are finite dimensional
and that

    $\dim V_0 = 0 = \dim V_n.$    (20)

Then

    $\sum_j (-1)^j \dim V_j = 0.$    (20')


Proof. Decompose $V_j$ as

    $V_j = N_j \oplus Y_j,$

where $N_j$ is the nullspace of $T_j$ and $Y_j$ complementary to $N_j$. The condition of
exactness requires that $T_j$ be an isomorphism of $Y_j$ with $N_{j+1}$. Since
$\dim V_j = \dim N_j + \dim Y_j$, it follows that

    $\dim V_j = \dim N_j + \dim N_{j+1}, \quad 0 \le j \le n - 1.$    (21)

By (20),

    $\dim N_0 = 0$ and $\dim V_{n-1} = \dim N_{n-1}.$    (21')

Setting (21) and (21') in the left side of (20') shows that the alternating sum is zero. □

To prove theorem 10, we construct the following exact sequence:

    $0 \to N_M \xrightarrow{I_0} N_{LM} \xrightarrow{M} N_L \xrightarrow{Q} U/R_M \xrightarrow{L} W/R_{LM} \xrightarrow{E} W/R_L \to 0.$    (22)


The mapping $I_0$ identifies $N_M$ as a subspace of $N_{LM}$. Q is the natural map of points
of U into the equivalence classes of U mod $R_M$ containing them. E is the mapping
of equivalence classes of W mod $R_{LM}$ into equivalence classes mod $R_L$.

Exercise 9. Verify that (22) is an exact sequence. 


We apply relation (20') to the exact sequence (22), with

    $V_0 = 0, \quad V_1 = N_M, \quad V_2 = N_{LM}, \quad V_3 = N_L,$
    $V_4 = U/R_M, \quad V_5 = W/R_{LM}, \quad V_6 = W/R_L, \quad V_7 = 0.$

Using the definition of codimension, we can write (20') as follows:

    $\dim N_M - \dim N_{LM} + \dim N_L - \operatorname{codim} R_M + \operatorname{codim} R_{LM} - \operatorname{codim} R_L = 0.$

Using the definition (18) of the index, we deduce the product formula (19) for the
index. □
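
For orientation, here is how the product formula (19) reads for the right shift R and the left shift L introduced after theorem 1. This worked example is an addition, not part of the original text; it takes for granted that R and L have pseudoinverses (the content of exercise 5), and the dimensions are read off directly from the definitions of the shifts.

```latex
\begin{align*}
N_R &= \{0\}, & \operatorname{codim} R_R &= 1, & \operatorname{ind} R &= 0 - 1 = -1,\\
N_L &= \{(a_1, 0, 0, \dots)\}, & \operatorname{codim} R_L &= 0, & \operatorname{ind} L &= 1 - 0 = 1,\\
LR &= I, & & & \operatorname{ind}(LR) &= 0 = \operatorname{ind} L + \operatorname{ind} R,\\
RLx &= (0, a_2, a_3, \dots), & & & \operatorname{ind}(RL) &= 1 - 1 = 0 = \operatorname{ind} R + \operatorname{ind} L.
\end{align*}
```

In both orders the index of the product is 0, although RL, unlike LR, is not the identity.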


The next result is called the stability of index: 


Theorem 12. Let $M : X \to U$ be a linear map with a pseudoinverse, and $G : X \to U$
a degenerate linear map. Then M + G has a pseudoinverse, and

    $\operatorname{ind}(M + G) = \operatorname{ind} M.$    (23)

Proof. We first verify (23) for U = X and M = I. For this we need a lemma:


Lemma 13. Let X be a linear space, and $K : X \to U$ a linear map of X into U that
has a pseudoinverse. Let $X_0$ be a linear subspace of X that has finite codimension.
Then $K_0 : X_0 \to U$, the restriction of K to $X_0$, has a pseudoinverse, and

    $\operatorname{ind} K_0 = \operatorname{ind} K - \operatorname{codim} X_0.$    (24)

Proof. Factor $K_0$ as

    $K_0 = K I_0,$    (24')

where $I_0 : X_0 \to X$ is the identification map. Clearly $N_{I_0} = \{0\}$, $R_{I_0} = X_0$, so

    $\operatorname{ind} I_0 = -\operatorname{codim} X_0.$    (25)

Now we apply the product formula (19) to (24') and deduce (24). □


Let $G : X \to X$ be a degenerate map; take $K : X \to X$ to be

    $K = I + G.$    (26)

Clearly, I is a pseudoinverse to K. Take $X_0$ to be the nullspace of G:

    $X_0 = N_G.$    (27)

By (14), $X_0$ has finite codimension. Since G is zero on $X_0$, $K_0$, the restriction of K
to $X_0$, is the identification map $I_0$. So by (25),

    $\operatorname{ind} K_0 = \operatorname{ind} I_0 = -\operatorname{codim} X_0.$

We apply now lemma 13 to K. By (24),

    $\operatorname{ind} K_0 = \operatorname{ind} K - \operatorname{codim} X_0.$

We deduce from the last two relations that

    $\operatorname{ind} K = 0$    (28)

for every K of form (26). This proves (23) for M = I.
We take now M as any map with a pseudoinverse; denote by $L : U \to X$ a
pseudoinverse of M. By definition,

    $LM = I + G', \quad G'$ degenerate.

So by (28),

    $\operatorname{ind}(LM) = \operatorname{ind}(I + G') = 0.$    (29)

Using the product formula (19), we get from (29) that

    $\operatorname{ind} L = -\operatorname{ind} M.$    (30)




As we saw in theorem 6 (i), for degenerate G, L is also a pseudoinverse of M + G.
Therefore, using (30) once more, we deduce that

    $\operatorname{ind} L = -\operatorname{ind}(M + G).$    (30')

Combining (30) and (30'), we get (23). □


NOTES


The first part of this chapter is standard fare. The nonstandard items are as follows: 


(i) The notion of the index of linear maps that have a pseudoinverse, theorem 7. 
(ii) The product formula for the index, theorem 10. 


(iii) The invariance of the index under perturbation by degenerate maps, theo- 
rem 12. 


Strange to say, these results of linear algebra were first discovered in the setting 
of bounded maps of normed linear spaces. That they hold without any topological 
assumptions has remained a folk theorem. The first statement and proof of the multi- 
plicative property in print is due to Donald Sarason. The proof presented here, using 
exact sequences, is due to Sergiu Klainerman. 


BIBLIOGRAPHY 


Sarason, D. The multiplication theorem for Fredholm operators. Am. Math. Monthly, 94 (1987): 68-70. 


3 


THE HAHN-BANACH 
THEOREM 


3.1 THE EXTENSION THEOREM 


The result named in the title of this chapter is remarkable for its simplicity and for 
its far-reaching consequences. It deals with the extension of linear functionals. 


Definition. A linear functional $\ell$ is a mapping of a linear space X over a field F into
F, that is linear:

    $\ell(x + y) = \ell(x) + \ell(y)$

for all x, y in X and

    $\ell(kx) = k\ell(x)$

for all k in F.


In this section we will mainly deal with linear spaces over the field of reals, and
real number valued linear functionals.


Theorem 1 (Hahn-Banach Theorem). Let X be a linear space over the reals, and
p a real-valued function defined on X, which has the following properties:

(i) Positive homogeneity,

    $p(ax) = ap(x)$ for all a > 0    (1)

for every x in X.
(ii) Subadditivity,

    $p(x + y) \le p(x) + p(y)$    (2)

for all x, y in X.

Y denotes a linear subspace of X on which a linear functional $\ell$ is defined that is
dominated by p:

    $\ell(y) \le p(y)$ for all y in Y.    (3)

Assertion. $\ell$ can be extended to all of X as a linear functional dominated by p:

    $\ell(x) \le p(x)$ for all x in X.    (3')


Proof. Suppose that Y is not all of X; then there is some z in X that is not in Y.
Denote by Z the linear span of Y and z, meaning all points of the form

    $y + az, \quad y \text{ in } Y, \ a \text{ in } \mathbb{R}.$

Our aim is to extend $\ell$ as a linear functional to Z so that (3') is satisfied for x in Z,
that is,

    $\ell(y + az) = \ell(y) + a\ell(z) \le p(y + az)$

holds for all y in Y and all real a. By (3), the inequality holds for a = 0. Since p is
positive homogeneous, it suffices to verify it for $a = \pm 1$:

    $\ell(y) + \ell(z) \le p(y + z), \quad \ell(y') - \ell(z) \le p(y' - z).$

Thus for all y, y' in Y,

    $\ell(y') - p(y' - z) \le \ell(z) \le p(y + z) - \ell(y)$    (4)

must hold. Such an $\ell(z)$ exists iff for all pairs y, y',

    $\ell(y') - p(y' - z) \le p(y + z) - \ell(y).$    (5)

This is the same as

    $\ell(y') + \ell(y) = \ell(y' + y) \le p(y + z) + p(y' - z).$    (5')

Since y + y' lies in Y, (3) holds:

    $\ell(y' + y) \le p(y' + y).$    (6)

By subadditivity,

    $p(y' + y) = p(y + z + y' - z) \le p(y + z) + p(y' - z).$    (7)

Combining (6) and (7) gives (5'), proving the possibility of extending $\ell$ from Y to Z
so that (3') remains satisfied.
Consider all extensions of $\ell$ to linear spaces Z containing Y on which inequality
(3') continues to hold. We order these extensions by defining

    $(Z, \ell) \le (Z', \ell')$

to mean that Z' contains Z, and that $\ell'$ agrees with $\ell$ on Z.




Let $\{Z_\nu, \ell_\nu\}$ be a totally ordered collection of extensions of $\ell$. Then we can define
$\ell$ on the union $Z = \bigcup Z_\nu$ as being $\ell_\nu$ on $Z_\nu$. Clearly, $\ell$ on Z satisfies (3'); equally
clearly, $(Z_\nu, \ell_\nu) \le (Z, \ell)$ for all $\nu$. This shows that every totally ordered collection
of extensions of $\ell$ has an upper bound. So the hypothesis of Zorn's lemma is satisfied,
and we conclude that there exists a maximal extension. But according to the
foregoing, a maximal extension must be to the whole space X. □
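
The one-step extension at the heart of the proof can be checked numerically in a small case. The sketch below is an added illustration under stated assumptions: X is the plane, p is the $\ell^1$ norm, Y is the horizontal axis with $\ell(t, 0) = t$, and z = (0, 1). It approximates the two bounds in (4) by sampling y and y' on a grid; the function names and the grid parameters are hypothetical.

```python
from typing import Tuple

Vec = Tuple[float, float]

def p(x: Vec) -> float:
    """A sublinear function on R^2: the l1 norm (an illustrative choice)."""
    return abs(x[0]) + abs(x[1])

def ell_on_Y(t: float) -> float:
    """The functional on Y = {(t, 0)}: l(t, 0) = t, which is dominated by p on Y."""
    return t

def admissible_interval(z: Vec, samples: int = 2001, span: float = 50.0) -> Tuple[float, float]:
    """Approximate the bounds in (4) by sampling y = (t, 0) on a grid.

    lower = sup over y' of [ l(y') - p(y' - z) ]
    upper = inf over y  of [ p(y + z) - l(y) ]
    """
    ts = [-span + 2 * span * k / (samples - 1) for k in range(samples)]
    lower = max(ell_on_Y(t) - p((t - z[0], -z[1])) for t in ts)
    upper = min(p((t + z[0], z[1])) - ell_on_Y(t) for t in ts)
    return lower, upper

lower, upper = admissible_interval((0.0, 1.0))
```

Here the admissible values of $\ell(z)$ fill, up to rounding, the interval from -1 to 1. This agrees with a direct check: the functional $(x_1, x_2) \mapsto x_1 + c x_2$ is dominated by the $\ell^1$ norm exactly when $|c| \le 1$. The interval being nonempty is what (5) guarantees; the extension is in general far from unique.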


3.2 GEOMETRIC HAHN-BANACH THEOREM 

In spite (or perhaps because) of its nonconstructive proof, the HB theorem has plenty
of very concrete applications. One of the most important is to separation theorems
concerning convex sets; these are sometimes called geometric Hahn-Banach theorems.


Definition. X is a linear space over the reals, S a subset of X. A point $x_0$ is called
an interior point of S if for any y in X there is an $\epsilon$, depending on y, such that

    $x_0 + ty \in S$ for all real t, $|t| < \epsilon$.

Let K be a convex set that has an interior point, which we take to be the origin.
We define the gauge $p_K$ of K with respect to the origin as follows:

    $p_K(x) = \inf a, \quad a > 0, \quad \frac{x}{a} \in K.$    (8)

Since the origin is assumed to be an interior point of K,

    $p_K(x) < \infty$

for every x.


Theorem 2. The gauge $p_K$ of a convex set K in a linear space over the reals is
positive homogeneous and subadditive.


Proof. Positive homogeneity follows from the definition (8), even when K is not
convex. To prove subadditivity, let x and y be any pair of points in X, a and b positive
numbers such that

    $\frac{x}{a} \in K, \quad \frac{y}{b} \in K.$    (9)

Convexity, as defined in chapter 1, means that any convex combination of points of K
belongs to K. We take the convex combination of x/a and y/b with weights a/(a + b)
and b/(a + b). These are nonnegative numbers whose sum is 1. We conclude that

    $\frac{a}{a+b}\,\frac{x}{a} + \frac{b}{a+b}\,\frac{y}{b} = \frac{x+y}{a+b} \in K.$

Since (x + y)/(a + b) is in K, by definition (8), $p_K(x + y) \le a + b$. Since this holds
for all a and b satisfying (9),

    $p_K(x + y) \le \inf(a + b) = \inf a + \inf b = p_K(x) + p_K(y),$

where in the last step we have again used (8). This proves subadditivity of $p_K$. □
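
The gauge (8) can be approximated from a membership test alone. The following sketch is an added illustration, not the author's construction: it assumes K is an ellipse in the plane containing the origin in its interior, and it finds $p_K(x)$ by bisection on a. The names `in_K` and `gauge` and the tolerance are assumptions.

```python
import math
from typing import Callable, Tuple

Vec = Tuple[float, float]

def in_K(x: Vec) -> bool:
    """Membership in K = {x_1^2 / 4 + x_2^2 <= 1}, an ellipse chosen for illustration."""
    return x[0] ** 2 / 4.0 + x[1] ** 2 <= 1.0

def gauge(x: Vec, member: Callable[[Vec], bool] = in_K, tol: float = 1e-12) -> float:
    """Approximate p_K(x) = inf{a > 0 : x/a in K} by bisection.

    Assumes K is convex with 0 in its interior, so that x/a lies in K for all
    sufficiently large a, and membership of x/a is monotone in a.
    """
    hi = 1.0
    while not member((x[0] / hi, x[1] / hi)):   # grow until x/hi is in K
        hi *= 2.0
    lo = 0.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if member((x[0] / mid, x[1] / mid)):
            hi = mid
        else:
            lo = mid
    return hi

x, y = (2.0, 1.0), (-1.0, 3.0)
px, py = gauge(x), gauge(y)
pxy = gauge((x[0] + y[0], x[1] + y[1]))   # gauge of x + y
p3x = gauge((3 * x[0], 3 * x[1]))         # gauge of 3x
```

For this ellipse the gauge is $\sqrt{x_1^2/4 + x_2^2}$, so the computed values can be compared with a closed form; on the sample points they also satisfy $p_K(x + y) \le p_K(x) + p_K(y)$ and $p_K(3x) = 3p_K(x)$ up to the bisection tolerance.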


Theorem 3. For any convex set K,

    $p_K(x) \le 1$ if $x \in K$,    (10)

    $p_K(x) < 1$ if x is an interior point of K.    (10')

Proof. (10) is an immediate consequence of definition (8) of $p_K$. □


Exercise 1. Prove (10').
The converse of theorem 3 also is true: 


Theorem 4. Let p denote a positive homogeneous, subadditive function defined on
a linear space X over the reals.

(i) The set of points x satisfying

    $p(x) < 1$

is a convex subset of X, and 0 is an interior point of it.
(ii) The set of points x satisfying

    $p(x) \le 1$

is a convex subset of X.


Exercise 2. Prove theorem 4. 


We turn now to the notion of a hyperplane. Suppose that $\ell$ is a linear functional
not identically zero; for any real c, all points of X belong to one, and only one, of the
following three sets:

    $\ell(x) < c, \quad \ell(x) = c, \quad \ell(x) > c.$

The set of x that satisfies

    $\ell(x) = c$

is called a hyperplane; the sets where $\ell(x) < c$, respectively $\ell(x) > c$, are called
open halfspaces. The sets where

    $\ell(x) \ge c$, or $\ell(x) \le c$,

are called closed halfspaces.


Theorem 5 (Hyperplane Separation Theorem). Let K be a nonempty convex subset
of a linear space X over the reals; suppose that all points of K are interior. Any
point y not in K can be separated from K by a hyperplane $\ell(x) = c$; that is, there is
a linear functional $\ell$, depending on y, such that

    $\ell(x) < c$ for all x in K, $\ell(y) = c$.    (11)


Proof. Assume that $0 \in K$, and denote by $p_K$ the gauge of K. Since all points of
K are interior, it follows from theorem 3 that $p_K(x) < 1$ for every x in K. We set

    $\ell(y) = 1.$    (12)

Then $\ell$ is defined for all z of the form ay,

    $\ell(ay) = a.$    (12')

We claim that for all such z,

    $\ell(z) \le p_K(z).$

This is obvious for $a \le 0$, for then $\ell(z) \le 0$ while $p_K \ge 0$. Since y is not in K, by
(8), $p_K(y) \ge 1$. So, by positive homogeneity, $p_K(ay) \ge a$ for a > 0.

Having shown that $\ell$, as defined on the above one-dimensional subspace, is dominated
by $p_K$, we conclude from the HB theorem that $\ell$ can be so extended to all of
X. We deduce from this and (10') that for any x in K,

    $\ell(x) \le p_K(x) < 1.$

This gives the first part of (11), with c = 1; the second part is (12). □


Corollary 5'. Let K denote a convex set with at least one interior point. For any y
not in K there is a nonzero linear functional $\ell$ that satisfies

    $\ell(x) \le \ell(y)$ for all x in K.    (13)


Theorem 6 (Extended Hyperplane Separation). X is a linear space over R, H
and M disjoint convex subsets of X, at least one of which has an interior point. Then
H and M can be separated by a hyperplane $\ell(x) = c$; that is, there is a nonzero
linear functional $\ell$, and a number c, such that

    $\ell(u) \le c \le \ell(v)$    (14)

for all u in H, all v in M.




Proof. According to theorem 5 of chapter 1, the difference set $H - M = K$ is
convex; since either H or M contains an interior point, so does K.

Since H and M are disjoint, $0 \notin K$; according to (13) of corollary 5' applied to
y = 0, there is a linear functional $\ell$ such that

    $\ell(x) \le \ell(0) = 0$ for all x in K.    (15)

Since every x in $K = H - M$ is of the form x = u - v, u in H, v in M, (15) means that

    $\ell(u) \le \ell(v).$

(14) follows from this, with $c = \sup_{u \in H} \ell(u)$. □


3.3 EXTENSIONS OF THE HAHN-BANACH THEOREM 


The following extension of the H-B theorem, due to R. P. Agnew and A. P. Morse, is
both useful and beautiful:


Theorem 7. Let X denote a linear space over the reals and $\mathcal{A}$ be a collection of
linear maps $A_\nu : X \to X$ that commute; that is,

    $A_\nu A_\mu = A_\mu A_\nu$    (16)

for all pairs in the collection. Let p denote a real-valued, positive homogeneous,
subadditive function on X (see (1) and (2)) that is invariant under each $A_\nu$:

    $p(A_\nu x) = p(x).$    (17)

Let Y denote a linear subspace of X on which a linear functional $\ell$ is defined, with
the following properties:

(i) $\ell$ is dominated by p, namely

    $\ell(y) \le p(y)$    (18)

for every y in Y.
(ii) Y is invariant under each mapping $A_\nu$, namely

    for y in Y, $A_\nu y$ in Y.    (19)

(iii) $\ell$ is invariant under each mapping $A_\nu$, namely

    $\ell(A_\nu y) = \ell(y)$ for y in Y.    (19')


Assertion. $\ell$ can be extended to all of X so that $\ell$ is dominated by p in the sense of
(18), and is invariant under each mapping $A_\nu$.




Proof. If (17) holds for two mappings A and B of the collection $\mathcal{A}$, it also holds
for their product AB, defined as their composite. Similarly, if (19) and (19') hold for
A and B, they hold for the product AB. Likewise, if A and B commute with all $A_\nu$,
so does their product. Thus we may adjoin to the collection $\mathcal{A}$ any finite products and
the identity I. This enlarged collection will now form a semigroup: if A and B
belong to it, so does their product AB. From now on we assume that the collection
$\mathcal{A}$ is a semigroup under multiplication.

We define a new function g on X as follows:

    $g(x) = \inf p(Cx),$    (20)

with C a convex combination of mappings in $\mathcal{A}$, namely maps of the form

    $C = \sum a_j A_j, \quad a_j \ge 0, \quad \sum a_j = 1, \quad A_j \text{ in } \mathcal{A}.$

Since $\mathcal{A}$ is a semigroup, the product of two convex combinations of mappings in $\mathcal{A}$
is also a convex combination.
Using subadditivity, homogeneity, and invariance (17), we deduce that

    $p(Cx) = p\left(\sum a_j A_j x\right) \le \sum a_j p(A_j x) = p(x).$    (21)

Since in (20) we may take C to be the identity, it follows that

    $g(x) \le p(x).$    (21')


Since p is positive homogeneous, it follows from (20) that so is g. We show next that
g is subadditive.
Let x and y be arbitrary elements of X. By definition (20), for any $\epsilon > 0$ there are
maps C and D in the convex hull of $\mathcal{A}$ such that

    $p(Cx) \le g(x) + \epsilon, \quad p(Dy) \le g(y) + \epsilon.$    (22)

Applying (20) to the map CD, we get, since C and D commute, that

    $g(x + y) \le p(CD(x + y)) = p(DCx + CDy).$    (23)

Using subadditivity, and (21), the right side of (23) is seen to be less than or equal to

    $p(DCx) + p(CDy) \le p(Cx) + p(Dy).$    (24)

Using (22) to estimate (24), we conclude that

    $g(x + y) \le g(x) + g(y) + 2\epsilon;$

since $\epsilon$ is arbitrary, subadditivity of g follows.
Since, by (19'), $\ell$ on Y is invariant under each A, for any convex combination C
of mappings in $\mathcal{A}$ and for any y in Y,




    $\ell(Cy) = \ell\left(\sum a_j A_j y\right) = \sum a_j \ell(A_j y) = \sum a_j \ell(y) = \ell(y).$

It follows from (19) that if y belongs to Y, so does Cy. Applying (18) to Cy, we get
that for y in Y,

    $\ell(Cy) \le p(Cy).$

Since we have shown that $\ell(Cy) = \ell(y)$,

    $\ell(y) \le p(Cy);$

by definition (20) of g, it follows from this that for all y in Y,

    $\ell(y) \le g(y).$    (25)


We apply now the Hahn-Banach theorem to conclude that $\ell$ can be extended to all
of X so that (25) holds. We claim that $\ell$ thus extended is invariant under all mappings
A in $\mathcal{A}$ in the sense of (19'). For any A in $\mathcal{A}$ and any natural number n, we define $C_n$
by $C_n = \frac{1}{n}\sum_{j=0}^{n-1} A^j$. Since $\mathcal{A}$ is a semigroup, $C_n$ belongs to the convex hull of $\mathcal{A}$.
According to the basic formula for geometric series, $C_n(I - A) = \frac{1}{n}(I - A^n)$.

Let x be any point in X; by definition (20) of g,

    $g(x - Ax) \le p(C_n(x - Ax)) = p(C_n(I - A)x) = \frac{1}{n}\,p(x - A^n x).$    (26)

In the last step we used the formula for geometric series, and the positive homogeneity
of p. Using subadditivity and (17), we deduce that

    $\frac{1}{n}\,p(x - A^n x) \le \frac{1}{n}[p(x) + p(-A^n x)] = \frac{1}{n}[p(x) + p(-x)].$

Combining this with (26), we get

    $g(x - Ax) \le \frac{1}{n}[p(x) + p(-x)].$    (26')

Now we let $n \to \infty$; since the right side of (26') tends to 0,

    $g(x - Ax) \le 0.$    (27)


Since g dominates $\ell$, we deduce from (27) that

    $\ell(x - Ax) \le 0.$

Since $\ell$ is linear, this implies that for all x,

    $\ell(x) \le \ell(Ax).$    (27')

Replacing x by -x, we get

    $\ell(-x) \le \ell(-Ax),$

which is the opposite of inequality (27'). So equality must hold, meaning that $\ell$ is
invariant under each A.
By construction, $\ell$ is dominated by g. It follows then from (21') that it is dominated
by p. □
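
The averaging step (26) can be observed on a toy example. The sketch below is an added illustration under stated assumptions: X is $\mathbb{R}^5$, A is the cyclic shift of coordinates, and p(x) is the largest coordinate of x, which is positive homogeneous, subadditive, and invariant under A. The names `cesaro_mean`, `ns`, and the sample vector are hypothetical.

```python
from typing import List

Vec = List[float]

def A(x: Vec) -> Vec:
    """Cyclic shift of coordinates; it leaves p below invariant, as (17) requires."""
    return x[-1:] + x[:-1]

def p(x: Vec) -> float:
    """A sublinear function invariant under A: the largest coordinate."""
    return max(x)

def cesaro_mean(x: Vec, n: int) -> Vec:
    """C_n x = (1/n) * sum_{j=0}^{n-1} A^j x."""
    total = [0.0] * len(x)
    power = list(x)                      # A^0 x
    for _ in range(n):
        total = [t + v for t, v in zip(total, power)]
        power = A(power)
    return [t / n for t in total]

x = [3.0, -1.0, 4.0, 1.0, -5.0]
d = [u - v for u, v in zip(x, A(x))]     # the vector x - Ax
ns = (1, 3, 7, 50)
values = [p(cesaro_mean(d, n)) for n in ns]
bound = p(x) + p([-v for v in x])        # p(x) + p(-x), the numerator in (26')
```

Each entry of `values` is at most `bound / n`, which is the estimate (26'); letting n grow forces $g(x - Ax) \le 0$.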


Exercise 3. Show that theorem 7 remains true if condition (17) is replaced by
$p(Ax) \le p(x)$.


We conclude with a version of HB for complex linear spaces due to Bohnenblust and
Sobczyk, and Soukhomlinoff:


Theorem 8. Let X be a linear space over C, and p a real valued function that
satisfies

(i)
    $p(ax) = |a|\,p(x)$    (28)

for all complex a, all x in X;
(ii) subadditivity,

    $p(x + y) \le p(x) + p(y).$

Let Y be a linear subspace of X over C, and let $\ell$ be a linear functional on Y that
satisfies

    $|\ell(y)| \le p(y)$ for y in Y.    (29)

Assertion. $\ell$ can be extended to all of X so that (29) holds over X.


Proof. Split $\ell$ into its real and imaginary part:

    $\ell(y) = \ell_1(y) + i\ell_2(y).$    (30)

Clearly, $\ell_1$ and $\ell_2$ are linear over R, and are related by

    $\ell_1(iy) = -\ell_2(y).$    (31)

Conversely, if $\ell_1$ is a linear functional over R,

    $\ell(x) = \ell_1(x) - i\ell_1(ix)$    (31')

is linear over C.
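
The converse statement (31') is easy to test numerically. The following sketch is an added illustration: it assumes X is $\mathbb{C}^2$ regarded as a real four-dimensional space, picks an arbitrary real-linear functional `ell1` (its coefficients are made up for the example), builds `ell` by (31'), and compares $\ell(cx)$ with $c\,\ell(x)$ for a complex scalar c.

```python
from typing import Tuple

CVec = Tuple[complex, complex]

def ell1(x: CVec) -> float:
    """A real-linear functional on C^2 viewed as R^4 (coefficients are arbitrary)."""
    return 2.0 * x[0].real - 1.0 * x[0].imag + 0.5 * x[1].real + 3.0 * x[1].imag

def ell(x: CVec) -> complex:
    """l(x) = l1(x) - i l1(ix), as in (31')."""
    ix = (1j * x[0], 1j * x[1])
    return complex(ell1(x), -ell1(ix))

def scale(c: complex, x: CVec) -> CVec:
    """Scalar multiple c x in C^2."""
    return (c * x[0], c * x[1])

x: CVec = (1.0 + 2.0j, -0.5 + 0.25j)
c = 0.3 - 1.7j
lhs = ell(scale(c, x))   # l(cx)
rhs = c * ell(x)         # c l(x)
```

Up to rounding, `lhs` and `rhs` agree, and the real part of `ell(x)` is `ell1(x)`. Note that `ell1` itself is only real-linear; complex linearity appears only after the combination (31').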


We turn now to the task of extending $\ell$. It follows from (29) and (30) that

    $\ell_1(y) \le p(y).$    (32)




Therefore by the real H-B theorem, $\ell_1$ can be extended to all of X so that (32) holds.
We define $\ell$ on X by (31'). Clearly, $\ell$ is linear over C and we claim that (29) holds.
To see this, write

    $\ell(x) = ar, \quad r \text{ real}, \ r \ge 0, \quad |a| = 1.$

Then

    $|\ell(x)| = r = a^{-1}\ell(x) = \ell(a^{-1}x) = \ell_1(a^{-1}x) \le p(a^{-1}x) = p(x).$

This completes the proof of the complex H-B theorem. □


A historical review and a modern update are given by Gerard Buskes in his survey
article.


BIBLIOGRAPHY 


Agnew, R. P. and Morse, A. P. Extension of linear functionals, with application to limits, integrals, measures,
and densities. Ann. Math., 39 (1938): 20-30.


Banach, S. Sur les fonctionelles linéaires. Studia Math., 1 (1929): 211-216, 223-229.


Bohnenblust, H. F. and Sobczyk, A. Extension of functionals on complex linear spaces. Bull. AMS, 44
(1938): 91-93.


Buskes, G. The Hahn-Banach Theorem Surveyed. Dissertationes Mathematicae, 327. 1993. 


Hahn, H. Über lineare Gleichungssysteme in linearen Räumen. J. Reine Angew. Math., 157 (1927): 214-229.


Soukhomlinoff, G. A. Über Fortsetzung von linearen Funktionalen in linearen komplexen Räumen und 
linearen Quaternion-räumen. Sbornik, N.S., 3 (1938): 353-358. 


4


APPLICATIONS OF THE 
HAHN-BANACH THEOREM 


4.1 EXTENSION OF POSITIVE LINEAR FUNCTIONALS 


S denotes any abstract set, and B = B(S) the collection of all real-valued functions
x on S that are bounded, that is, satisfy

    $|x(s)| \le c.$    (1)

B is a linear space over the reals.

There is a natural partial order for the elements of B: $x \le y$ means that $x(s) \le y(s)$
for all s in S. A function x satisfying $0 \le x$ is called nonnegative.

Let Y be a linear subspace of B that contains some nonnegative functions. A
linear functional $\ell$ defined on Y is called positive on Y if $\ell(y) \ge 0$ for all nonnegative
y in Y. Every positive linear functional $\ell$ is monotone:

    $y_1 \le y_2$ implies $\ell(y_1) \le \ell(y_2)$.    (2)


Theorem 1. Let Y be a linear subspace of B that contains a function $y_0$ greater than some positive constant, say 1:

$1 \le y_0(s)$ for all s in S. (3)

Let $\ell$ be a positive linear functional defined on Y. Then $\ell$ can be extended to all of B as a positive linear functional.


Proof. We define the function p on B as follows: for any x in B,

$p(x) = \inf \ell(y)$, $x \le y$, y in Y. (4)

This function p is well defined, for it follows from (1) and (3) that

$-c\,y_0 \le x \le c\,y_0,$ (5)




which shows that the inf in (4) is over a nonempty set, and that $p(x) \le c\,\ell(y_0)$, where c is any constant satisfying (1). The smallest such constant is $c = \sup_{s \in S} |x(s)|$. It follows from (5) that any $y \ge x$ satisfies $-c\,y_0 \le x \le y$. Since $\ell$ is linear and positive, for such y it follows from (2) that $-c\,\ell(y_0) \le \ell(y)$, and so by (4)

$-c\,\ell(y_0) \le p(x).$ (6)


Lemma 2. The function p defined by (4) is

(i) positive homogeneous.
(ii) subadditive.
(iii) negative: $p(x) \le 0$ for $x \le 0$.
(iv) $p(x) = \ell(x)$ for x in Y.


Proof.

(i) It follows from the definition that $x \le y$ implies $ax \le ay$ for $a > 0$. Positive homogeneity follows from definition (4).

(ii) Let $x_1$ and $x_2$ be any two functions in B, $y_1$ and $y_2$ any two functions in Y satisfying

$x_1 \le y_1$, $x_2 \le y_2$.

Adding the two we obtain $x_1 + x_2 \le y_1 + y_2$; so by definition (4) of p,

$p(x_1 + x_2) = \inf_{x_1 + x_2 \le y} \ell(y) \le \inf_{x_1 \le y_1,\ x_2 \le y_2} \ell(y_1 + y_2) = \inf_{x_1 \le y_1} \ell(y_1) + \inf_{x_2 \le y_2} \ell(y_2) = p(x_1) + p(x_2).$ (7)

This proves subadditivity.

(iii) Suppose that $x \le 0$; then y = 0 is admissible in the inf on the right in (4), giving $p(x) \le \ell(0) = 0$, as asserted in (iii).

(iv) Suppose that x belongs to Y; then by (2), $x \le y$ implies $\ell(x) \le \ell(y)$, equality holding for y = x. Setting this into (4) gives $p(x) = \ell(x)$, as asserted in (iv). □


It follows from lemma 2 that we can apply the Hahn-Banach theorem to extend $\ell$ from Y to all of B so that $\ell$ remains dominated by p:

$\ell(x) \le p(x).$ (8)

Suppose that x is nonpositive. Then by (iii), $p(x) \le 0$, so by (8),

$\ell(x) \le 0$ for $x \le 0$. (9)

This shows that $\ell$ is positive, as asserted in theorem 1. □




Theorem 1 is a special case of a very general theorem of Mark Krein; see p. 20 of 
Kelley and Namioka. 


4.2 BANACH LIMITS 
B denotes the space of bounded infinite sequences x of real numbers, 


$x = (a_1, a_2, \ldots).$ (10)


B is a linear space over the reals when vector addition and multiplication by a scalar 
are defined componentwise. We define the function p on B as follows: 


$p(x) = \limsup_{n \to \infty} a_n,$ (11)

where x is given by (10). It follows from this definition that p is a positive homogeneous function of x; we leave it as an exercise to the reader to prove that p is subadditive.

Define A as left translation, that is, 


$Ax = (a_2, a_3, \ldots).$ (12)


It is an immediate consequence of definition (11) that p is translation invariant, 
namely that 


p(Ax) = p(x). (13) 


We define Y as the space of convergent sequences of real numbers. Clearly, Y is a linear subspace of B. On Y, we define the linear functional $\ell$ by

$\ell(y) = \lim_{n \to \infty} b_n,$ (14)

where

$y = (b_1, b_2, \ldots).$ (14')

Clearly, $\ell$ is linear. Comparing definitions (11) and (14), we conclude that

$\ell(y) = p(y)$ for y in Y. (15)

Clearly, Y is mapped into itself by translation; equally clearly, $\ell$ is invariant on Y under translation:

$\ell(Ay) = \ell(y)$ for y in Y. (16)

We apply now theorem 7 in chapter 3 to conclude that $\ell$ can be extended to all bounded sequences x in B so that




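(i) $\ell$ is linear.
(ii) $\ell$ is invariant under translation.
(iii) $\ell$ is dominated by p.

Theorem 3. To each bounded sequence (10) we can assign a generalized limit (or "Banach limit"), denoted as

$\operatorname{LIM}_{n \to \infty} a_n,$

so that

(i) For convergent sequences the generalized limit agrees with the usual limit.
(ii) $\operatorname{LIM}_{n \to \infty} (a_n + b_n) = \operatorname{LIM}_{n \to \infty} a_n + \operatorname{LIM}_{n \to \infty} b_n.$
(iii) For any k, $\operatorname{LIM}_{n \to \infty} a_{n+k} = \operatorname{LIM}_{n \to \infty} a_n.$
(iv) $\liminf_{n \to \infty} a_n \le \operatorname{LIM}_{n \to \infty} a_n \le \limsup_{n \to \infty} a_n.$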


Proof. We set, in the notation of (10),

$\operatorname{LIM}_{n \to \infty} a_n = \ell(x).$

Part (i) follows from (14), (14'); part (ii) expresses the linearity of $\ell$; part (iii) is the translation invariance of $\ell$. Part (iv) expresses the domination of $\ell$ by p, as defined by (11), and applied to $\ell(x)$ and $\ell(-x)$:

$-p(-x) \le \ell(x) \le p(x).$ □
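As a small illustration of properties (i) to (iii), consider the bounded, nonconvergent sequence 1, 0, 1, 0, ... . Here $a_n + a_{n+1} = 1$ for every n, so (i), (ii), and (iii) together force $2\operatorname{LIM} a_n = 1$; every generalized limit of this sequence must equal 1/2. The Python sketch below, with hypothetical helper names, computes the arithmetic means of the same sequence, which approach the same value.

```python
from itertools import cycle, islice

def cesaro_means(seq, n_terms):
    """Arithmetic means (a_1 + ... + a_n) / n for n = 1, ..., n_terms."""
    total = 0.0
    means = []
    for n, a in enumerate(islice(seq, n_terms), start=1):
        total += a
        means.append(total / n)
    return means

# The bounded, nonconvergent sequence 1, 0, 1, 0, ...
alternating = cycle([1.0, 0.0])
means = cesaro_means(alternating, 10_000)

first_mean = means[0]    # mean of the first term alone
last_mean = means[-1]    # mean of the first 10,000 terms
```

The agreement is only an illustration for this one sequence; it does not by itself show that a Banach limit can be chosen to agree with the arithmetic means in general.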


Exercise 1. Show that if in section 4.1 we take S = {positive integers}, Y the space of convergent sequences, and $\ell$ defined by (14), then the function p given by (4) is the same as that defined by (11).

Exercise 2. Show that a Banach limit can be so chosen that for any bounded sequence $(c_1, c_2, \ldots)$ that is Cesàro summable, namely one for which the arithmetic means of the partial sums converge to c,

$\operatorname{LIM}_{n \to \infty} c_n = c.$

Exercise 3. Show that a generalized limit as $t \to \infty$ can be assigned to all bounded functions x(t) defined on $t \ge 0$ that has properties (i) to (iv) in theorem 3.


4.3 FINITELY ADDITIVE INVARIANT SET FUNCTIONS 


The Lebesgue measure on the unit circle is invariant under rotation. This measure can 
be extended to a considerably larger σ-algebra than the Lebesgue measurable sets on
the unit circle so that rotational invariance is retained. However it is well known, 
and easy to show, that if we accept the axiom of choice, then there is no rotationally 
invariant countably additive measure defined for all subsets of the circle. We show 
now 


Theorem 4. One can define a nonnegative finitely additive set function m(P), for 
all subsets P of the circle, that is invariant under rotation. 


Proof. We take S to be the unit circle, and B the set of all bounded real-valued functions on S. We take Y to be the space of bounded, Lebesgue measurable functions on S, and take $\ell(y)$ to be the Lebesgue integral of y:

$\ell(y) = \int_S y(\theta)\, d\theta.$ (17)

The space Y contains the function $y_0 \equiv 1$, so condition (3) of theorem 1 of section 4.1 is fulfilled. Therefore the function p described there by equation (4) is well defined.

We denote by $\{A_\rho\}$ the action on functions of rotations $\rho$ of the circle. As remarked above, $\ell$ is invariant under rotation:

$(A_\rho y)(\theta) = y(\theta + \rho)$, $\ell(A_\rho y) = \ell(y)$. (18)

Since the relation $x \le y$ also is invariant under rotation, it follows that p as defined by (4) is rotation invariant:

$p(A_\rho x) = p(x).$ (18')


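Rotations of the circle commute, and so the linear maps $\{A_\rho\}$ form a commuting group of maps. We apply now theorem 7 of chapter 3 to conclude that $\ell$ can be extended to all of B so that $\ell$ is

(i) linear.
(ii) invariant under rotation.
(iii) dominated by p: $\ell(x) \le p(x)$.

For any subset P of the circle, denote by $c_P$ its characteristic function:

$c_P(\theta) = 1$ if $\theta$ is in P, $c_P(\theta) = 0$ otherwise. (19)

We define the set function m by setting

$m(P) = \ell(c_P).$ (19')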




As shown in theorem 1, it follows from $\ell(x) \le p(x)$ that $\ell$ is positive. Since $c_P$ is a nonnegative function, it follows from definition (19') of m that m is nonnegative:

$m(P) \ge 0.$

Let $\rho$ be any rotation; denote the set P rotated by $\rho$ as $P + \rho$. It follows from the definition (19) of $c_P$ that

$c_{P+\rho} = A_\rho c_P.$ (20)

Since $\ell$ is rotation invariant, it follows from the definition (19') of m that

$m(P + \rho) = m(P),$

meaning that m is rotationally invariant.

Let $P_1$ and $P_2$ be disjoint subsets. Then, by definition (19),

$c_{P_1 \cup P_2} = c_{P_1} + c_{P_2}.$

Setting this into the definition (19') of m, and using the linearity of $\ell$, we deduce that

$m(P_1 \cup P_2) = m(P_1) + m(P_2).$

This proves that m is finitely additive. □


NOTE. Rotations of the circle commute with each other, and so the operators $A_\rho$ commute; this was needed in invoking theorem 7 of chapter 3. Rotations of the three-dimensional sphere do not commute, and neither do the corresponding operators $A_\rho$. Therefore the above proof cannot be used to extend theorem 4 to three dimensions. In fact Hausdorff has shown that the three-dimensional analogue of theorem 4 is false; there is no rotationally invariant, finitely additive set function on the 2-sphere. The proof is based on a finite decomposition of the 2-sphere, sometimes called the Banach-Tarski paradox.


In conclusion, we point out that the duality theory of Banach spaces constitutes the
richest applications of the Hahn-Banach theorem. These are described in chapters 8 
and 9. 


HISTORICAL NOTE. His name is etched into the foundations of modern analy- 
sis: Hausdorff space, Hausdorff maximality principle, and Hausdorff measures are 
household concepts. He was a German mathematician, born in 1868; as a young man 
he published several volumes of poetry and aphorisms. He spent most of his pro- 
fessional life as professor in Bonn. Because he was Jewish, in 1942 he was ordered 
deported, part of the “Final Solution” to kill all the Jews in Europe. Knowing what 
awaited them, Hausdorff, his wife, and sister-in-law committed suicide. 




BIBLIOGRAPHY 


Hausdorff, F. Grundzüge der Mengenlehre. Verlag von Veit, Leipzig, 1914. Reprinted by Chelsea Publishing, New York.


Kelley, J. L. and Namioka, I. Linear Topological Spaces. Van Nostrand, Princeton, NJ, 1963. 


5 


NORMED LINEAR SPACES


5.1 NORMS 


Let X denote a linear space over R or C. A norm in X is a real-valued function $X \to \mathbb{R}$, denoted as |x|, with the following properties:

(i) Positivity,
$|x| > 0$ for $x \ne 0$; $|0| = 0$. (1)
(ii) Subadditivity,
$|x + y| \le |x| + |y|$. (2)
(iii) Homogeneity. For all scalars a,
$|ax| = |a|\,|x|$. (3)

With the aid of a norm we can introduce a metric in X, by defining the distance of two points to be

$d(x, y) = |x - y|.$ (4)

It is easy to verify that this has all properties of a metric. Conversely, it is easy to show that every metric in a linear space that is translation invariant and homogeneous:

$d(x + z, y + z) = d(x, y)$, $d(ax, ay) = |a|\, d(x, y)$ (4')

comes from a norm via (4).


With a metric (4) we can employ topological notions such as convergent series, 
open sets, closed sets, and compact sets. Those notions turn out to be crucial. 



Definition. Two different norms, $|x|_1$ and $|x|_2$, defined on the same space X are called equivalent if there is a constant c such that

$c\,|x|_1 \le |x|_2 \le c^{-1} |x|_1$ (5)

for all x in X.


The significance of this notion is that equivalent norms induce the same topology. 

In chapter 1 we looked at various ways of building new linear spaces; the same 
constructions can be used to build new normed linear spaces. Specifically we ob- 
served the following: 


(i) A subspace Y of a normed linear space X is again a normed linear space. 

(ii) Given two linear spaces Z and U, their Cartesian product, denoted as a direct sum $Z \oplus U$, consists of all ordered pairs (z, u), $z \in Z$, $u \in U$. When Z and U are normed, $Z \oplus U$ can be normed, such as by setting

$|(z, u)| = |z| + |u|$, $|(z, u)|' = \max\{|z|, |u|\}$, or $|(z, u)|'' = (|z|^2 + |u|^2)^{1/2}$. (6)


Exercise 1. 


(a) Show that (6) are norms. 
(b) Show that they are equivalent norms in the sense of (5). 
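A numerical experiment can suggest which constants to aim for in part (b). The three expressions in (6) depend only on the numbers |z| and |u|, so the Python sketch below (function names are illustrative, not from the text) samples pairs of nonnegative numbers and checks the chain max ≤ Euclidean ≤ sum ≤ 2 max. This is a sanity check under random sampling, not a proof.

```python
import math
import random

def norm_sum(nz, nu):
    """|(z, u)| = |z| + |u|, given the two component norms."""
    return nz + nu

def norm_max(nz, nu):
    """|(z, u)|' = max{|z|, |u|}."""
    return max(nz, nu)

def norm_euclid(nz, nu):
    """|(z, u)|'' = (|z|^2 + |u|^2)^(1/2)."""
    return math.hypot(nz, nu)

random.seed(1)
samples = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(1000)]
tol = 1e-12  # slack for floating-point rounding

# Chain of inequalities: max <= euclid <= sum <= 2 * max.
chain_holds = all(
    norm_max(a, b) <= norm_euclid(a, b) + tol
    and norm_euclid(a, b) <= norm_sum(a, b) + tol
    and norm_sum(a, b) <= 2 * norm_max(a, b) + tol
    for a, b in samples
)
```

If the chain holds in general, any two of the three norms satisfy (5) with c = 1/2.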


Let X be a normed linear space, Y a subspace. We saw in chapter 1 that we can define 
their quotient X/Y as a linear space. We raise now the question: is there a natural 
way to introduce a norm in the quotient space? The answer is yes, provided that Y is 
closed: 


Theorem 1. Let Y be a closed subspace of a normed linear space X. Let $\{x_j\}$ be an equivalence class of elements of X mod Y. We define

$|\{x_j\}| = \inf_j |x_j|.$ (7)

Assertion. (7) has all properties of a norm in the quotient space X/Y.


Proof. Property (3), homogeneity, holds trivially. To verify subadditivity, let $\{x_j\}$ and $\{z_j\}$ denote two equivalence classes. For any $\epsilon > 0$ we can, by definition (7), choose representatives so that

$|x_j| < |\{x_j\}| + \epsilon$, $|z_j| < |\{z_j\}| + \epsilon$. (8)

By definition of addition in X/Y, $x_j + z_j$ belongs to $\{x_j\} + \{z_j\}$; therefore, by definition (7),

$|\{x_j\} + \{z_j\}| \le |x_j + z_j|,$

which by subadditivity in X, and (8), is

$\le |x_j| + |z_j| < |\{x_j\}| + |\{z_j\}| + 2\epsilon.$

Since this is true for all $\epsilon > 0$, subadditivity of the norm (7) follows.


Clearly, (7) is nonnegative. To show positivity, suppose that $|\{x_j\}| = 0$. By definition (7), there is a sequence of elements $x_n$ in $\{x_j\}$ such that

$\lim_{n \to \infty} |x_n| = 0.$ (9)

By definition of equivalence, the equivalent elements $x_n$ differ from each other by elements that belong to Y. In particular, we can write

$x_n = x_1 - y_n$, n = 2, 3, ..., $y_n$ in Y.

Setting this into (9), we see that

$\lim_{n \to \infty} |x_1 - y_n| = 0,$

which by (4) means in the language of metric spaces that

$\lim_{n \to \infty} y_n = x_1.$ (9')

In a metric space, the limit of a sequence of elements in a subset Y belongs to the closure of Y. Now, since Y is assumed to be closed, (9') implies that $x_1$ belongs to Y. But then the whole equivalence class $\{x_j\}$ consists of elements of Y, which is the zero element in X/Y. □
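The quotient norm (7) can be made concrete in a small case. The Python sketch below assumes, purely for illustration, that X is the plane with the max norm and Y is the line spanned by (1, 1); neither choice comes from the text. It approximates the infimum in (7) by a grid search over representatives x - t(1, 1) and compares it with the closed form |a - b|/2 that holds for this particular pair X, Y.

```python
def quotient_norm(x, radius=10.0, steps=20001):
    """Approximate inf over t of |x - t*(1, 1)| in the max norm on R^2.

    The grid covers t in [-radius, radius]; this is a crude search,
    adequate only when the minimizing t lies inside that interval.
    """
    a, b = x
    best = float("inf")
    for k in range(steps):
        t = -radius + 2.0 * radius * k / (steps - 1)
        best = min(best, max(abs(a - t), abs(b - t)))
    return best

x = (3.0, -1.0)
approx = quotient_norm(x)
exact = abs(x[0] - x[1]) / 2.0   # closed form for this choice of Y and norm

# Another representative of the same equivalence class gives the same value.
shifted = quotient_norm((x[0] + 2.5, x[1] + 2.5))
```

That the value does not change when x is shifted by an element of Y reflects the fact that (7) is a function of the equivalence class, not of the representative.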


Theorem 2. Let X be a normed linear space, Y a subspace of X. The closure of Y is a linear subspace of X.


Exercise 2. Prove theorem 2. 


For purposes of analysis, in the construction of objects with desirable properties 
through limiting processes, we need metric spaces that are complete in the sense that 
every Cauchy sequence has a limit. So it is with normed linear spaces: 


Definition. A Banach space is a normed linear space that is complete.

We recall the process of completion of a metric space, whereby any metric space S is embedded in a complete metric space denoted as $\bar{S}$, consisting of equivalence classes of Cauchy sequences. S is a dense subset of $\bar{S}$, i.e., the closure of S is $\bar{S}$.

Theorem 3. The completion $\bar{X}$ of a normed linear space X under the metric (4) has a natural linear structure that makes $\bar{X}$ a complete normed linear space.




Proof. Recall that the points of the completion of a metric space are equivalence classes of Cauchy sequences. The term-by-term sum of two Cauchy sequences is again a Cauchy sequence, and sums of equivalent Cauchy sequences are equivalent. □


Exercise 3. Show that if X is a Banach space and Y a closed subspace of X, then the quotient space X/Y is complete. (Hint: Use a Cauchy sequence $\{q_n\}$ in X/Y that satisfies $|q_n - q_{n+1}| < 1/n^2$.)


The process of completion of a normed linear space is one of the royal roads to 
obtaining complete normed linear spaces. This is extremely important for the success 
of functional analysis. We describe now a number of the most important normed 
linear spaces. These are the household items of modern analysis. 


(a) The space of all vectors with infinitely many components

$x = \{a_1, a_2, \ldots\}$, $a_j$ complex,

where the $|a_j|$ are bounded. The norm is

$|x|_\infty = \sup_j |a_j|.$ (10)

This space is denoted as $\ell^\infty$; it is complete.

(b) The space of all vectors with infinitely many components such that $\sum |a_j|^p < \infty$, p some fixed number $\ge 1$. The norm is

$|x|_p = \left(\sum |a_j|^p\right)^{1/p}.$ (11)

This space is denoted as $\ell^p$; it is complete.

(c) S an abstract set, X the space of all complex-valued functions f on S that are bounded. The norm is

$|f|_\infty = \sup_s |f(s)|.$ (12)

This space is complete.

(d) Q a topological space, X the space of all complex-valued, continuous, bounded functions f on Q. The norm is

$|f|_\infty = \sup_q |f(q)|.$ (13)

This space is complete.




(e) Q a topological space, X the space of all complex-valued, continuous functions f with compact support. The norm is

$|f|_{\max} = \max_q |f(q)|.$ (13')

This space is not complete unless Q is compact.

(f) D some domain in $\mathbb{R}^n$, X the space of continuous functions f with compact support. The norm is

$|f|_p = \left(\int_D |f(x)|^p\, dx\right)^{1/p}$, $1 \le p$. (14)

This space is not complete; its completion is denoted by $L^p$.

(g) D some domain in $\mathbb{R}^n$, the space of all $C^\infty$ functions f in D with the following property: for some integer k and $p \ge 1$,

$\int_D |\partial^\alpha f|^p\, dx < \infty$ for all $|\alpha| \le k$,

where $\partial^\alpha$ is any partial derivative:

$\partial^\alpha = \partial_1^{\alpha_1} \cdots \partial_n^{\alpha_n}$, $\partial_j = \partial/\partial x_j$, $|\alpha| = \alpha_1 + \cdots + \alpha_n$.

The norm is

$|f|_{k,p} = \left(\sum_{|\alpha| \le k} \int_D |\partial^\alpha f|^p\, dx\right)^{1/p}.$ (15)

This space is not complete; its completion is denoted as $W^{k,p}$, and is called a Sobolev space.


Theorem 4. The norms defined in examples (a) through (g) have properties (1) 
through (3) imposed on a norm. 


Proof. Properties (1) and (3), positivity and homogeneity, are obviously satisfied. We turn now to property (2), subadditivity. For the sake of brevity we consider only examples (a) and (b). Note that (a) can be regarded as a limiting case of (b), with $p = \infty$.

Define x and y as

$x = \{a_1, a_2, \ldots\}$, $y = \{b_1, b_2, \ldots\}$.

Then

$x + y = \{a_1 + b_1, a_2 + b_2, \ldots\}$.


We take first $p = \infty$. By (10),

$|x + y|_\infty = \sup_j |a_j + b_j| \le \sup_j \left(|a_j| + |b_j|\right) \le \sup_j |a_j| + \sup_j |b_j| = |x|_\infty + |y|_\infty.$

Next we turn to p = 1. By (11),

$|x + y|_1 = \sum |a_j + b_j| \le \sum \left(|a_j| + |b_j|\right) = |x|_1 + |y|_1.$


For $1 < p < \infty$ we need Hölder's inequality. To state it, we introduce vectors u with finite q-norm:

$u = \{c_1, c_2, \ldots\}$, $\left(\sum |c_j|^q\right)^{1/q} = |u|_q < \infty,$ (16)

where q is conjugate to p, in the sense

$\frac{1}{p} + \frac{1}{q} = 1.$ (17)

We define now a scalar product between vectors in $\ell^p$ and $\ell^q$ as follows:

$(x, u) = \sum a_j c_j.$ (18)

Hölder's Inequality. For x in $\ell^p$, u in $\ell^q$ the series defining the scalar product (18) converges, and

$|(x, u)| \le |x|_p\, |u|_q,$ (19)

provided that p and q are conjugate in the sense of (17).

For a proof we refer to Courant's Calculus, Vol 2. The sign of equality holds in (19) iff

$\arg a_j c_j$ and $|a_j|^p / |c_j|^q$ are independent of j. (20)

Since for given x in $\ell^p$ we can always choose u in $\ell^q$ so that (20) is satisfied, and so that $|u|_q = 1$, we can restate Hölder's inequality thus:


Theorem 5. For any x in $\ell^p$,

$|x|_p = \max_{|u|_q = 1} |(x, u)|.$ (21)

Note that the scalar product (18) is bilinear as a function of x and u. Applying (21) to x + y in place of x and using the linear dependence on x, we get

$|x + y|_p = \max_{|u|_q = 1} |(x + y, u)| \le \max_{|u|_q = 1} \left(|(x, u)| + |(y, u)|\right).$ (22)

By Hölder's inequality (19), for $|u|_q = 1$,

$|(x, u)| \le |x|_p$, $|(y, u)| \le |y|_p$.

Setting this into (22) gives

$|x + y|_p \le |x|_p + |y|_p,$

as asserted in theorem 4. □
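The maximizing vector in theorem 5 can be written down explicitly in the finite, real case. The Python sketch below is a minimal illustration under that assumption; the sample vectors and the helper names are hypothetical. For real x it takes $c_j = \operatorname{sign}(a_j)\,|a_j|^{p-1} / |x|_p^{p-1}$, which satisfies condition (20), and checks that $|u|_q = 1$, that $(x, u) = |x|_p$, and that the resulting subadditivity holds for one pair x, y.

```python
import math

def p_norm(v, p):
    """The norm (11) for a finite real vector."""
    return sum(abs(c) ** p for c in v) ** (1.0 / p)

def pairing(x, u):
    """The scalar product (18) for finite real vectors."""
    return sum(a * c for a, c in zip(x, u))

def extremal_u(x, p):
    """Real-vector maximizer: c_j = sign(a_j) |a_j|^(p-1) / |x|_p^(p-1)."""
    norm = p_norm(x, p)
    return [math.copysign(abs(a) ** (p - 1), a) / norm ** (p - 1) for a in x]

p = 3.0
q = p / (p - 1.0)          # conjugate exponent, 1/p + 1/q = 1
x = [1.0, -2.0, 0.5, 3.0]  # sample vectors chosen for illustration
y = [0.3, 1.0, -4.0, 2.0]

u = extremal_u(x, p)
u_q_norm = p_norm(u, q)                  # should be 1
attained = pairing(x, u)                 # should equal |x|_p
holder_ok = abs(pairing(x, y)) <= p_norm(x, p) * p_norm(y, q) + 1e-12
sum_xy = [a + b for a, b in zip(x, y)]
minkowski_ok = p_norm(sum_xy, p) <= p_norm(x, p) + p_norm(y, p) + 1e-12
```

For complex vectors the sign factor would be replaced by a unit complex number that makes each product $a_j c_j$ nonnegative; that case is not covered by this sketch.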


The self-conjugate case p = q = 2 is an instance of a supremely important class of norms, to be discussed in the next chapter.

The norms defined in examples (f) and (g) satisfy important inequalities due to Sobolev: If

$kp < n$ and $p \le q \le \frac{np}{n - kp}$ (23)

and if Q is a cube, then

$|f|_q \le \text{const.}\ |f|_{k,p},$ (23')


where the constant depends only on p, q, k, n. These inequalities hold of course for 
all Q that are the images of cubes under a smooth mapping. Even more generally, 
they hold for all domains Q that satisfy a cone condition. For a proof, see Adams or 
Mazya. 

Since the spaces $L^q$ and $W^{k,p}$ are constructed by completing the space of smooth functions in the appropriate norms, it follows that if condition (23) is fulfilled, $W^{k,p}$ is contained in $L^q$.

The normed linear spaces studied and used in analysis are infinite-dimensional. According to Cantor's theory of sets, there is a gradation among infinities; the least of them are the countable sets.


Definition. A normed linear space is called separable if it contains a countable set 
of points that is dense, namely, whose closure is the whole space. 


Most, but not all, spaces that are used in analysis are separable. Here is an impor- 
tant example that is not: 


(h) The space of all signed measures m on, say, the interval [0, 1], of finite total mass. We define the norm to be the total mass:

$|m| = \int_0^1 |dm|.$

Denote by $m_y$ the unit mass located at the point y. Clearly, for $y \ne z$, $|m_y - m_z| = 2$. Since there are nondenumerably many points y in the interval [0, 1], this shows that the space of measures is not separable.


5.2 NONCOMPACTNESS OF THE UNIT BALL 


Many existence theorems in finite-dimensional spaces rest on the fact that the closed unit ball, meaning the set of points

$B_1 = \{x : |x| \le 1\},$ (24)

is compact, that is to say, that any sequence of points in $B_1$ has a convergent subsequence. F. Riesz has shown that this property characterizes finite-dimensional spaces:

Theorem 6. Let X be an infinite-dimensional normed linear space; then the unit ball $B_1$ defined by (24) is not compact.


Proof. We require first a lemma: 


Lemma 7. Let Y be a closed, proper subspace of the normed linear space X. Then there is a vector z in X of length 1,

$|z| = 1,$ (25)

and that satisfies

$|z - y| > \frac{1}{2}$ for all y in Y. (25')

Proof. Since Y is a proper subspace of X, some point x of X does not belong to Y. Since Y is closed, x has a positive distance to Y:

$\inf_{y \in Y} |x - y| = d > 0.$ (26)

There is then a $y_0$ in Y such that

$|x - y_0| < 2d.$ (27)

Denote $z' = x - y_0$; we can then write (27) as

$|z'| < 2d.$ (27')

It follows from (26) that

$|z' - y| \ge d$ for all y in Y. (28)

We set

$z = \frac{z'}{|z'|}.$

Clearly, (25) holds, and (25') follows from combining (27') and (28). □


REMARK 1. Clearly, the number 1/2 on the right of (25') can be replaced by any number < 1.

We turn now to the proof of theorem 6. We construct a sequence $\{y_n\}$ of unit vectors recursively as follows: $y_1$ is chosen arbitrarily. Suppose that $y_1, \ldots, y_{n-1}$ have been chosen; denote by $Y_n$ the linear space spanned by them. Since $Y_n$ is finite-dimensional, it is closed; since X is infinite-dimensional, $Y_n$ is a proper subspace of X. So lemma 7 is applicable and a z with properties (25), (25') exists. We set

$y_n = z.$

Since $y_j$, j < n, belongs to $Y_n$,

$|y_n - y_j| > \frac{1}{2}$, j < n.

This shows that the distance of any two distinct $y_j$ exceeds 1/2. Therefore no subsequence can form a Cauchy sequence. Since all $y_j$ belong to the unit ball $B_1$, it follows that $B_1$ is not compact. □
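In the sequence spaces of examples (a) and (b) such a separated sequence can be exhibited directly: the standard unit vectors (1, 0, 0, ...), (0, 1, 0, ...), ... have mutual distance $2^{1/p}$ in $\ell^p$ and 1 in $\ell^\infty$, whatever their number. The Python sketch below checks this on a truncation to n coordinates; the truncation and the helper names are assumptions made for the illustration.

```python
from itertools import combinations

def unit_vector(j, n):
    """The j-th standard unit vector, truncated to n coordinates."""
    return [1.0 if i == j else 0.0 for i in range(n)]

def p_norm(v, p):
    """The norm (11), or the sup norm (10) when p is infinite."""
    if p == float("inf"):
        return max(abs(c) for c in v)
    return sum(abs(c) ** p for c in v) ** (1.0 / p)

def min_pairwise_distance(n, p):
    """Smallest distance between two distinct unit vectors among the first n."""
    vectors = [unit_vector(j, n) for j in range(n)]
    return min(
        p_norm([a - b for a, b in zip(u, v)], p)
        for u, v in combinations(vectors, 2)
    )

d2 = min_pairwise_distance(6, 2.0)             # Euclidean case
dinf = min_pairwise_distance(6, float("inf"))  # sup-norm case
```

Because the distances do not shrink as more unit vectors are added, no subsequence of them can be a Cauchy sequence, in line with the argument above.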


Exercise 4. Prove that every finite-dimensional subspace of a normed linear space is 
closed. (Hint: Use the fact that all norms are equivalent on finite-dimensional spaces 
to show that every finite-dimensional subspace is complete.) 


Next we describe a kind of a substitute for the compactness that is lacking in the 
unit ball. 


Definition. A norm is called strictly subadditive if in (2) strict inequality holds ex- 
cept when x or y is a nonnegative multiple of the other. 


Exercise 5. Show that the sup norms of examples (a), (c), (d), and (e) are not strictly subadditive.

Exercise 6. Show that the norms in examples (b) and (f) are not strictly subadditive for p = 1.

All the norms in examples (b) and (f) are strictly subadditive when $1 < p < \infty$. Furthermore for each of these norms the condition holds uniformly, in the following sense:




For any pair of unit vectors x, y, the norm of (x + y)/2 is strictly less than 1 by an amount that depends only on $|x - y|$. More explicitly, there is an increasing function $\epsilon(r)$ defined for positive r,

$\epsilon(r) > 0$, $\lim_{r \to 0} \epsilon(r) = 0$, (29)

such that for all x, y in the unit ball $|x| \le 1$, $|y| \le 1$, the inequality

$\left|\frac{x + y}{2}\right| \le 1 - \epsilon(|x - y|)$ (30)

holds.


Definition. A normed linear space whose norm satisfies (30) for all vectors x, y of unit length, where $\epsilon(r)$ is some function satisfying (29), is called uniformly convex.
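Two small computations make the definition concrete. For the Euclidean norm on the plane, the parallelogram identity $|x + y|^2 + |x - y|^2 = 2|x|^2 + 2|y|^2$ (a standard fact, not derived in this section) gives, for unit vectors, $|(x + y)/2| = (1 - r^2/4)^{1/2}$ with $r = |x - y|$, so one may take $\epsilon(r) = 1 - (1 - r^2/4)^{1/2}$. For the sup norm, by contrast, there are unit vectors at distance 2 whose midpoint still has norm 1. The Python sketch below checks both; the specific vectors are illustrative choices.

```python
import math

def norm2(v):
    """Euclidean norm of a finite real vector."""
    return math.sqrt(sum(c * c for c in v))

def norm_sup(v):
    """Sup norm of a finite real vector."""
    return max(abs(c) for c in v)

def midpoint(x, y):
    return [(a + b) / 2.0 for a, b in zip(x, y)]

def diff(x, y):
    return [a - b for a, b in zip(x, y)]

# Euclidean norm: two unit vectors at angle theta.
theta = 0.7
x = [1.0, 0.0]
y = [math.cos(theta), math.sin(theta)]
r = norm2(diff(x, y))
mid_norm = norm2(midpoint(x, y))
bound = math.sqrt(1.0 - r * r / 4.0)   # equals 1 - epsilon(r) for the choice above

# Sup norm: unit vectors at distance 2 whose midpoint still has norm 1.
xs, ys = [1.0, 1.0], [1.0, -1.0]
sup_mid_norm = norm_sup(midpoint(xs, ys))
sup_distance = norm_sup(diff(xs, ys))
```

The second computation shows that no function $\epsilon(r)$ with $\epsilon(2) > 0$ can satisfy (30) for the sup norm on the plane.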


Theorem 8. Let X be a uniformly convex Banach space. Let K be a closed, convex subset of X, z any point of X. Then there is a unique point y of K which is closer to z than any other point of K.


Proof. We may take z = 0, provided that we assume that 0 does not lie in K. Denote by s the distance of 0 to K, that is,

$s = \inf |y|$, y in K. (31)

Since 0 does not lie in K, and since K is closed, s > 0. Let $\{y_n\}$ be a minimizing sequence for (31), that is,

$y_n$ in K, $|y_n| = s_n \to s$. (31')

Define the unit vectors $x_n$ as

$x_n = y_n / s_n.$

With the abbreviations $c_n = s_m/(s_n + s_m)$ and $c_m = s_n/(s_n + s_m)$ we can write

$\frac{x_n + x_m}{2} = \frac{y_n}{2s_n} + \frac{y_m}{2s_m} = \left(\frac{1}{2s_n} + \frac{1}{2s_m}\right)(c_n y_n + c_m y_m).$ (32)

Clearly, $c_n$ and $c_m$ are positive, and $c_n + c_m = 1$. Since K is convex, it follows that $c_n y_n + c_m y_m$ belongs to K. Therefore, by (31),

$|c_n y_n + c_m y_m| \ge s.$

Setting this into (32), we get that

$\left|\frac{x_n + x_m}{2}\right| \ge \frac{s}{2s_n} + \frac{s}{2s_m}.$ (33)

Since $\{y_n\}$ is a minimizing sequence for (31), $s_n \to s$; therefore the right side of (33) tends to 1. So it follows from (33), (30), and (29) that $\lim_{n,m \to \infty} |x_n - x_m| = 0$. Since $y_n = s_n x_n$ and $s_n \to s$, the sequence $\{y_n\}$ is also a Cauchy sequence; as X is complete and K is closed, it converges to a point y of K with |y| = s.


The power of theorem 8 lies in the fact that it asserts the existence of a minimum, when the set K over which we wish to minimize is not compact; according to theorem 6, a Banach space has many closed, bounded sets K that are not compact.

The notion of uniform convexity is due to Clarkson, as is the result that the $L^p$ spaces are uniformly convex for $1 < p < \infty$.

We give now an example that shows that in the space C, which is not uniformly 
convex—indeed, the maximum norm is not even strictly subadditive—the conclusion 
of theorem 8 fails. 

We take X to be C[-1, 1], the space of continuous real-valued functions defined on the closed interval $-1 \le t \le 1$. We take K to consist of all functions k(t) that satisfy

$\int_{-1}^{0} k\, dt = 0$, $\int_{0}^{1} k\, dt = 0$. (34)

K is a linear subspace and therefore convex, and clearly closed.

We take for z(t) any function in C for which

$\int_{-1}^{0} z\, dt = 1$, $\int_{0}^{1} z\, dt = -1$.

It follows from (34) that for any k in K

$\int_{-1}^{0} (z - k)\, dt = 1$, $\int_{0}^{1} (z - k)\, dt = -1$.

From this it follows that

$\max_{-1 \le t \le 0} [z(t) - k(t)] \ge 1,$ (35)

equality holding iff

$z(t) - k(t) \equiv 1$ for $-1 \le t \le 0$, (35')

and similarly that

$\min_{0 \le t \le 1} [z(t) - k(t)] \le -1,$ (36)

equality holding iff

$z(t) - k(t) \equiv -1$ for $0 \le t \le 1$. (36')

Conditions (35') and (36') cannot both hold at t = 0, so it follows that in at least one of (35) or (36) strict inequality holds. This proves that

$|z - k|_{\max} > 1$ (37)

for any k in K. On the other hand, one could choose k in K so that the max and min in (35) and (36) are as close to 1 and -1, respectively, as one wishes. So

$\inf_{k \in K} |z - k|_{\max} = 1.$ (37')

This combined with (37) shows that there is no closest point to z in K. □


5.3 ISOMETRIES 


We turn now to isometries of a Banach space X onto itself, meaning mappings M of 
X onto X which preserve the distance of any pair of points: 


$|M(x) - M(y)| = |x - y|$ for all x, y in X. (38)


Clearly, translations M(x) = x + u, u fixed, are isometries. Also the isometries of 
X form a group. We want to investigate those isometries that map 0 into 0; all others 
can be obtained by composing these with a translation. 


Theorem 9. Let X be a linear space over the reals with a strictly subadditive norm. Let M be an isometric mapping of X into itself that maps the origin into itself. Then M is linear.

Proof. Denote for simplicity M(x) by x'. Take any pair of points x and y and define

$z = \frac{x + y}{2}.$ (39)

Using isometry as stated in (38), and the definition of z, we have

$|x' - z'| = |x - z| = \frac{|x - y|}{2}$, $|z' - y'| = |z - y| = \frac{|x - y|}{2}$, (40)

and

$|x' - y'| = |x - y|.$ (40')

These imply that

$|x' - y'| = |x' - z'| + |z' - y'|.$

Since the norm is strictly subadditive, x' - z' and z' - y' must be positive multiples of each other. Since by (40) they have the same norm, they must be equal: x' - z' = z' - y'. Hence

$2z' = x' + y'.$ (41) □

Exercise 7. Deduce from (41) that M is linear.
It is a fact of life that some Banach spaces are very rich in isometries; others are very poor. Among the rich ones are the Hilbert spaces discussed in chapter 6; among the poor ones are function spaces with the max norm. Here is an example due to […]. Denote by X the space of null sequences of complex numbers

$x = \{a_n\}$, $\lim_{n \to \infty} a_n = 0$, (42)

normed by

$|x| = \max_n |a_n|.$ (42')


Exercise 8. Show that X is complete.

Let $\{b_n\}$ be an arbitrary sequence of complex numbers of absolute value 1: $|b_n| = 1$. Define the mapping U by

$Ux = \{b_n a_n\}.$ (43)

Clearly, U is a linear map of X onto X, and satisfies |Ux| = |x|; thus U is an isometry.

Let p be a permutation of the positive integers. Define the map P by

$Px = \{a'_n\}$, $a'_n = a_{p(n)}$. (44)

Clearly, P is a linear map of X onto X, and an isometry.
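The two families of isometries can be tried out on a truncated sequence. The Python sketch below works with the first eight terms only and uses 0-based indices; the sample sequence, the phases, and the permutation are arbitrary illustrative choices. It applies a map of type (43) and a map of type (44) and compares max norms before and after.

```python
import cmath

def max_norm(x):
    """The norm (42') on a finite truncation."""
    return max(abs(a) for a in x)

def apply_U(b, x):
    """Type (43): multiply the n-th term by b_n, where |b_n| = 1."""
    return [bn * an for bn, an in zip(b, x)]

def apply_P(perm, x):
    """Type (44): the n-th term of the image is x[perm[n]]."""
    return [x[perm[n]] for n in range(len(x))]

n_terms = 8
# A sample sequence whose terms decay in absolute value.
x = [cmath.exp(1j * n) / (n + 1) for n in range(n_terms)]
# Phases of absolute value 1.
b = [cmath.exp(0.3j * n) for n in range(n_terms)]
# A permutation of the indices 0..7.
perm = [3, 0, 7, 1, 6, 2, 5, 4]

norm_x = max_norm(x)
norm_Ux = max_norm(apply_U(b, x))
norm_Px = max_norm(apply_P(perm, x))
norm_PUx = max_norm(apply_P(perm, apply_U(b, x)))   # a composite of both types
```

Theorem 10 below asserts that composites of this kind are the only linear isometries of X; the sketch checks only the easy direction, that such composites preserve the norm.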


Theorem 10. Every linear isometry of the Banach space X defined by (42), (42') is the composite of an isometry of type (43) and one of type (44).

Proof. Let $u_j$ be a jth unit vector, i.e., a vector whose jth component has absolute value 1, all others are zero. Denote by $T_j$ the linear subspace of X consisting of all vectors whose jth component is zero. Clearly,

$T_j$ is closed and codim $T_j = 1$, (45)
$|u_j + t| = 1$ for all t in $T_j$, $|t| \le 1$. (46)

Conversely, we have

Lemma 11. Let u be a vector in X, |u| = 1, and T a subspace of X of codimension 1 so that (45) and (46) hold. Then u is a unit vector and T the corresponding subspace $T_j$.

Proof. By definition (42') of the norm

$1 = |u| = |u_m|$ for some index m.

It follows from (46) that no vector t in T can have an mth component $\ne 0$. Since T is assumed to have codimension 1, it follows that T consists of all null sequences whose mth component is zero. From this it follows, by (46), that all components of u other than the mth must be zero. □

Let M be a linear isometry of X onto X; let $u_j$ be any unit vector and $T_j$ the corresponding subspace. Since M is linear, isometric, and onto, it follows that $u'_j$ and $T'_j$, the images of $u_j$ and $T_j$ under M, satisfy (45) and (46). Then by lemma 11, $u'_j$ is a unit vector; from this and the linearity of the isometry, theorem 10 follows readily.

We conclude this chapter with the following result due to Mazur and Ulam: 


Theorem 12. Let X and X' be two normed linear spaces over the reals, M an isometric mapping of X onto X' that carries 0 into 0. Then M is linear.


Proof. The case where the norms are strictly subadditive is covered in theorem 9. 


In the general case we take, as before, x and y to be any pair of points, and z their midpoint:

$z = \frac{x + y}{2}.$

As before, in (40), z is halfway between x and y, but when the norm in X is not strictly subadditive, this no longer characterizes the midpoint z. There may be other points u also halfway between x and y:

$|x - u| = |y - u| = \frac{|x - y|}{2}.$ (47)




We denote the set of all such u by A. We claim that this set A is symmetric with respect to the midpoint z. That is, if u belongs to A, then so does

$v = 2z - u.$ (48)

To see this, we note that 2z = x + y, and so

$v - x = y - u$, and $v - y = x - u$.

It follows from (47) that v is halfway between x and y.

We define the diameter $d_A$ of A as the greatest distance between pairs of points of A:

$d_A = \sup_{u, w \text{ in } A} |u - w|.$ (49)

Since A is symmetric with respect to z, for all u in A,

$|u - z| \le \frac{1}{2} d_A.$

Of course, there may be other points p in A with this property:

$|u - p| \le \frac{1}{2} d_A$ for all u in A. (50)

We denote the set of all such p by $A_1$. We claim that $A_1$ is symmetric with respect to the midpoint z. That is, if p belongs to $A_1$, so does

$q = 2z - p.$ (51)

For using (48), we can write for any u in A,

$q - u = 2z - u - p = v - p.$ (51')

We conclude from (51') that $|q - u| = |v - p|$. Since v belongs to A when u does, it follows from (50) that $|u - q| \le \frac{1}{2} d_A$.

It follows from (50) that the diameter of $A_1$ does not exceed half the diameter of A:

$d_{A_1} \le \frac{1}{2} d_A.$ (52)

We now repeat this construction, obtaining a nested sequence of sets $A \supset A_1 \supset A_2 \supset \cdots$, each containing the midpoint z, each symmetric with respect to z, and their diameters satisfying

$d_{A_n} \le \frac{1}{2} d_{A_{n-1}}.$

Clearly, $d_{A_n}$ tends to zero; it follows that the intersection of all the sets $A_n$ consists of the single point z. This characterizes the midpoint z of x, y purely in terms of the metric structure of X.




Let M be an isometric mapping of X onto X'. Then the inverse of M maps X'
isometrically onto X. Denote by x' and y' the images of x, y under M, and denote
by $A', A_1', \ldots, A_n', \ldots$ the sets analogous to the ones defined in X. Recall that the set A
was defined by (47), and that the set $A_1$ was defined by (49) and (50). Since these
relations refer only to distances, and since M is an isometry, it follows that M
maps every point of $A_n$ into $A_n'$. Since the inverse of M is isometric, it maps every
point of $A_n'$ into $A_n$. Thus M maps $A_n$ onto $A_n'$, and thus the intersection of the $A_n$
onto the intersection of the $A_n'$. Since these intersections are, respectively, (x + y)/2
and (x' + y')/2, it follows that

$$M\left(\frac{x + y}{2}\right) = \frac{x' + y'}{2}. \qquad (53)$$


Setting y = 0, and using the assumption that M(0) = y' = 0, we get that
M(x/2) = x'/2. Applying this to equation (53), we get

$$M(x + y) = x' + y' = M(x) + M(y).$$

This is the first property of linearity, see equation (1) in chapter 2. From this we
deduce that M(kx) = kM(x) for all rational k. Since M is an isometry, it is continuous,
and so the relation holds for all real k. □


BIBLIOGRAPHY 


Adams, R. A. Sobolev Spaces. Academic Press, New York, 1975.
Clarkson, J. A. Uniformly convex spaces. Trans. AMS, 40 (1936): 396-414.
Day, M. M. Normed Linear Spaces. Springer Verlag, 1958.
Mazur, S. and Ulam, S. Sur les transformations isométriques d'espaces vectoriels normés. C. R. Acad. Sci.
Paris, 194 (1932): 946-948.
Mazya, V. Sobolev Spaces. Springer Verlag, 1985.


6


HILBERT SPACE


6.1 SCALAR PRODUCT 


A scalar product in a linear space X over R is a real valued function of two points x 
and y in X, denoted as (x, y), having the following properties: 


(i) Bilinearity. For fixed y, (x, y) is a linear function of x, for fixed x a linear
function of y.

(ii) Symmetry, (y, x) = (x, y).

(iii) Positivity, (x, x) > 0 for $x \ne 0$.

When the field of scalars is C, (x, y) is complex valued, and properties (i) and
(ii) are altered as follows:

(i) Sesquilinearity. For fixed y, (x, y) is a linear function of x, and for fixed x, (x, y) is a
skewlinear function of y, that is,

$$(ax, y) = a(x, y), \qquad (x, ay) = \bar{a}(x, y). \qquad (1)$$

(ii) Skew symmetry,

$$(y, x) = \overline{(x, y)}. \qquad (2)$$

Given a scalar product, we can define a norm, denoted by || ||, as follows:

$$\|x\| = (x, x)^{1/2}. \qquad (3)$$


We claim that || || has the obvious properties of a norm: 
Positivity follows from (iii), and homogeneity from (1). To show subadditivity we 
need 


Theorem 1 (Schwarz Inequality). A scalar product satisfying (i), (ii), and (iii) satisfies

$$|(x, y)| \le \|x\|\,\|y\|, \qquad (4)$$

where the norm is defined by (3). Equality holds for x = ay or y = 0.


Proof. Let t be a real scalar and $y \ne 0$. Using bilinearity and skew symmetry, we
can write

$$\|x + ty\|^2 = \|x\|^2 + 2t\,\mathrm{Re}(x, y) + t^2\|y\|^2. \qquad (5)$$

By (iii), this is nonnegative. Set $t = -\mathrm{Re}(x, y)/\|y\|^2$ and multiply by $\|y\|^2$. We get

$$(\mathrm{Re}(x, y))^2 \le \|x\|^2\|y\|^2.$$

Replacing x by ax, |a| = 1 so chosen that a(x, y) is real, we deduce (4). Note that
equality holds in (4) iff x and y are scalar multiples of one another. □


Corollary 1'. For every vector x in a scalar product space

$$\|x\| = \max_{\|y\| = 1} |(x, y)|.$$


Now we are ready to prove that the norm is subadditive. Set t = 1 in (5) and
estimate the middle term by (4). We get

$$\|x + y\|^2 \le (\|x\| + \|y\|)^2,$$

which is subadditivity of the norm.
We set $t = \pm 1$ in (5) and add to obtain the parallelogram identity:

$$\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2. \qquad (6)$$
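As a quick numerical sanity check of the Schwarz inequality (4) and the parallelogram identity (6), here is a small sketch for the standard scalar product on $\mathbb{C}^n$. The helper names and the sample vectors are arbitrary assumptions chosen for illustration.

```python
# Sketch: check (4) and (6) for the standard scalar product on C^n,
# (x, y) = sum of a_j * conj(b_j); the sample vectors are arbitrary.

def inner(x, y):
    """Scalar product, linear in x and skewlinear in y, as in (1)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm_sq(x):
    """Square of the norm (3); (x, x) is real, so take the real part."""
    return inner(x, x).real

def add(x, y):
    return [a + b for a, b in zip(x, y)]

def sub(x, y):
    return [a - b for a, b in zip(x, y)]

x = [1 + 2j, 3 - 1j, 0.5j]
y = [2 - 1j, 1j, -4 + 0j]

# Nonnegative if (4) holds: ||x||^2 ||y||^2 - |(x, y)|^2.
schwarz_gap = norm_sq(x) * norm_sq(y) - abs(inner(x, y)) ** 2

# Zero (up to rounding) if (6) holds.
parallelogram_gap = (norm_sq(add(x, y)) + norm_sq(sub(x, y))
                     - 2 * norm_sq(x) - 2 * norm_sq(y))
```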


Exercise 1. Show that a norm that satisfies (6) comes from a scalar product, an
observation due to von Neumann.


Exercise 2. Show that the scalar product depends continuously on its factors; that is,
if $x_n \to x$, $y_n \to y$ in the sense of $\|x_n - x\| \to 0$, $\|y_n - y\| \to 0$, then
$(x_n, y_n) \to (x, y)$.


Definition. Two vectors x and y are called orthogonal if (x, y) = 0.


Definition. A linear space with a scalar product that is complete with respect to the 
induced norm is called a Hilbert space. 


Given a linear space with a scalar product, it can be completed with respect to the
norm derived from the scalar product. It follows from the Schwarz inequality that the
scalar product is a continuous function of its factors; therefore it can be extended to
the completed space. Thus the completion is a Hilbert space.
We give some examples of linear spaces with inner product:


Example 1. The space of continuous functions x(t) on the interval [0, 1], with

$$(x, y) = \int_0^1 x(t)\,\overline{y(t)}\,dt.$$

This space is incomplete.

Example 2. The space $\ell^2$ of vectors with infinitely many components:

$$x = (a_1, a_2, \ldots), \qquad y = (b_1, b_2, \ldots),$$

subject to the restriction

$$\sum |a_j|^2 < \infty, \qquad \sum |b_j|^2 < \infty.$$

We define the scalar product as

$$(x, y) = \sum a_j \bar{b}_j.$$

Exercise 3. Show that $\ell^2$ is complete.


Example 3. The space $L^2$ of all functions square integrable according to Lebesgue
on some domain in $\mathbb{R}^n$. This space is complete.


Many other examples will come up in the applications presented in subsequent 
chapters. 


6.2 CLOSEST POINT IN A CLOSED CONVEX SUBSET 


Theorem 2. Given a nonempty closed, convex subset K of a Hilbert space H, and
a point x in H, there is a unique point y in K that is closer to x than any other point
of K.

Proof. Define

$$\inf_{z \in K} \|x - z\| = d. \qquad (7)$$

Let $y_n$ in K be a minimizing sequence:

$$\lim d_n = d, \qquad d_n = \|x - y_n\|. \qquad (8)$$


We apply the parallelogram identity (6) to $(x - y_n)/2$ and $(x - y_m)/2$ in place of x and y:

$$\left\|x - \frac{y_n + y_m}{2}\right\|^2 + \frac{1}{4}\|y_n - y_m\|^2 = \frac{1}{2}\left(d_n^2 + d_m^2\right). \qquad (9)$$

Since K is convex, $(y_n + y_m)/2$ belongs to K, and so by (7), $\|x - (y_n + y_m)/2\| \ge d$.
Using this and (8) in (9), we deduce that $\|y_n - y_m\|^2 \le 2(d_n^2 + d_m^2) - 4d^2$, which tends
to zero; thus $y_n$ is a Cauchy sequence. Since H is complete and K closed, $y = \lim y_n$
belongs to K. Since $\|x - y\| = \lim \|x - y_n\| = d$, y minimizes the distance from x.
That there is only one minimizer follows from (6):
suppose that y' is another minimum, and apply (6) to x - y, x - y'. □
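For a concrete feel of Theorem 2, the following sketch computes the closest point for one specific closed convex set, the box $K = [0, 1]^3$ in $\mathbb{R}^3$, where the minimizer is obtained by clamping each coordinate. The set K, the point x, and the comparison grid are illustrative assumptions, not taken from the text.

```python
# Sketch: the closest point of Theorem 2 for one concrete closed convex set,
# the box K = [0, 1]^n in R^n (an illustrative choice, not from the text).
import itertools
import math

def closest_point_in_box(x, lo=0.0, hi=1.0):
    """Clamp each coordinate; for a box this yields the nearest point of K."""
    return [min(max(c, lo), hi) for c in x]

def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

x = [1.5, -0.25, 0.4]
y = closest_point_in_box(x)
d = distance(x, y)

# Compare against a grid of other points z of K.
ticks = [i / 10 for i in range(11)]
others = [list(z) for z in itertools.product(ticks, repeat=3)]
is_minimizer = all(distance(x, z) >= d - 1e-12 for z in others)
```

The grid comparison is only a spot check; the uniqueness and existence statements are what the parallelogram argument above establishes.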


Theorem 2 is a special case of theorem 8 in chapter 5. 


Definition. Let Y be a linear subspace; its orthogonal complement consists of all
vectors v orthogonal to Y, that is, satisfying (v, y) = 0 for all y in Y. It is denoted as $Y^\perp$.


Theorem 3. Let H be a Hilbert space, Y a closed subspace of H, $Y^\perp$ the orthogonal
complement of Y. We claim that

(i) $Y^\perp$ is a closed linear subspace of H;

(ii) Y and $Y^\perp$ are complementary subspaces, meaning that every x can be decomposed
uniquely as a sum of a vector in Y and in $Y^\perp$;

(iii) $(Y^\perp)^\perp = Y$.
Proof. It follows from the bilinearity of scalar product that the set of vectors v
orthogonal to all vectors of any set Y forms a linear space. This shows that $Y^\perp$ is a
linear space. Let $\{v_j\}$ be a convergent sequence of elements of $Y^\perp$:

$$\lim v_j = v. \qquad (10)$$

We claim that v belongs to $Y^\perp$, namely that

$$(v, z) = 0 \quad\text{when } z \text{ is in } Y.$$

Since $v_j$ belongs to $Y^\perp$,

$$(v, z) = (v - v_j, z) + (v_j, z) = (v - v_j, z).$$

By the Schwarz inequality applied on the right,

$$|(v, z)| \le \|v - v_j\|\,\|z\|; \qquad (11)$$

by (10), $\|v - v_j\|$ tends to zero. So (11) shows that (v, z) = 0, meaning that $Y^\perp$ is
closed, as asserted in (i).




We turn now to (ii). Given any x in H, there is, according to theorem 2, a vector y
in Y closest to x. Set

$$v = x - y. \qquad (12)$$

The minimum property of y means that for any z in Y and any real t,

$$\|v\|^2 \le \|v + tz\|^2.$$

Using (5), we can rewrite the right side as $\|v\|^2 + 2t\,\mathrm{Re}(v, z) + t^2\|z\|^2$ and conclude
that

$$\mathrm{Re}(v, z) = 0 \quad\text{for all } z \text{ in } Y. \qquad (13)$$

In the complex case, applying (13) to iz in place of z shows that the imaginary part of
(v, z) vanishes as well. This shows that v belongs to $Y^\perp$; (12) gives the decomposition
of x as y + v, the sum of a vector from Y and one from $Y^\perp$.

This decomposition is unique, for if x = y + v = y' + v', then y - y' = v' - v
would belong to both Y and $Y^\perp$ and thus would be orthogonal to itself. But then,
by positivity, y - y' = v' - v = 0. Thus (ii) is proved. Part (iii) is an immediate
consequence of (ii).


COMMENT. It follows from theorem 3 that every closed linear subspace of a Hilbert 
space has a closed complement. This is not true for all Banach spaces; examples will 
be given later. 


6.3 LINEAR FUNCTIONALS

A Hilbert space comes equipped with a whole set of built-in linear functionals. For
y fixed, $(x, y) = \ell(x)$ is a linear functional of x, that is, a linear mapping of H
into C. Furthermore, according to the Schwarz inequality (4), $\ell(x)$ is bounded by a
constant multiple of ||x||. It turns out that conversely:

Theorem 4. Let $\ell(x)$ be a linear functional on a Hilbert space H that is bounded:

$$|\ell(x)| \le \text{const.}\,\|x\|. \qquad (14)$$

Then $\ell$ is of the form

$$\ell(x) = (x, y), \quad y \text{ in } H. \qquad (15)$$

The point y is uniquely determined.


Proof. We will use the following facts: 


Lemma 5.

(i) The nullspace of a linear functional that is not $\equiv 0$ is a linear subspace of
codimension 1.

(ii) If two linear functionals $\ell$ and m have the same nullspace, they are constant
multiples of each other:

$$\ell = cm. \qquad (16)$$

(iii) The nullspace of a linear functional that is bounded in the sense of (14) is a
closed subspace.


Exercise 4. Prove lemma 5. 


Note that lemma 5 holds in any Banach space, and parts (i) and (ii) in any linear 
space. 


Suppose now that $\ell$ is not $\equiv 0$. Then its nullspace is a closed subspace Y of H of
codimension 1. Its orthogonal complement $Y^\perp$ (see theorem 3) is one dimensional.
Let p be any nonzero vector in $Y^\perp$, and define the linear functional m by

$$m(x) = (x, p).$$

Clearly, the nullspace of m is Y. So by (16) of part (ii) of lemma 5,

$$\ell(x) = cm(x) = (x, \bar{c}p).$$

□


Theorem 4 is called the Riesz-Frechet representation theorem. 
The following useful generalization has been given by Milgram and Lax: 


Theorem 6 (Lax-Milgram Lemma). Let H be a Hilbert space, and B(x, y) a
function of two vectors with the following properties:

(i) B(x, y) is for fixed y a linear function of x, for fixed x a skewlinear function
of y.

(ii) B is bounded: there is a constant c so that for all x and y in H

$$|B(x, y)| \le c\,\|x\|\,\|y\|. \qquad (17)$$

(iii) There is a positive constant b such that

$$|B(y, y)| \ge b\,\|y\|^2 \qquad (18)$$

for all y in H.

Assertion. Every linear functional $\ell$ on H that is bounded in the sense of (14) is of
the form

$$\ell(x) = B(x, y), \quad y \text{ a uniquely determined vector in } H. \qquad (19)$$




Proof. By (i) and (ii), for y fixed B(x, y) is a bounded linear functional of x.
Therefore by theorem 4 it can be written as

$$B(x, y) = (x, z), \quad z \text{ in } H. \qquad (20)$$

Since z is uniquely determined by y, it is a function of y. It follows from (20) that
the relation of z to y is linear; it follows from this that the set of z appearing in (20)
as y takes on all values in H is a linear subspace of H. We claim that it is a closed
linear subspace. To see this, set x = y in (20):

$$B(y, y) = (y, z). \qquad (20')$$

Using (18) on the left and the Schwarz inequality on the right, we get after dividing
by ||y|| that

$$b\,\|y\| \le \|z\|. \qquad (21)$$

Let $\{z_n\}$ be a sequence of vectors appearing in (20), with corresponding $y_n$:

$$B(x, y_n) = (x, z_n). \qquad (22)$$

Subtraction and skew linearity give $B(x, y_n - y_m) = (x, z_n - z_m)$. By (21),
$b\,\|y_n - y_m\| \le \|z_n - z_m\|$. From this it follows that if $z_n$ converges to z, the corresponding
$y_n$ form a Cauchy sequence. Since H is complete, the sequence $\{y_n\}$ converges to
a limit y. It follows from (17) that the left side of (22) converges to B(x, y), and it
follows from (4) that the right side converges to (x, z). So

$$B(x, y) = (x, z),$$

which proves that the set of z appearing in (20) forms a closed subspace of H.

We claim that this closed subspace is all of H; for if not, then according to theorem 3
there would be a nonzero vector x orthogonal to all z. It follows from (20) that
such an x satisfies B(x, y) = 0 for all y. Setting y = x gives B(x, x) = 0; using
(18), we get ||x|| = 0, contrary to $x \ne 0$.

According to theorem 4, all linear functionals $\ell(x)$ can be represented as (x, z),
z in H. Combined with (20), this establishes (19). It follows from (18) that y is
uniquely determined. □
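In finite dimensions the Lax-Milgram lemma reduces to solving a linear system. The sketch below works in the real Hilbert space $\mathbb{R}^2$ with the dot product and takes $B(x, y) = x \cdot (Ay)$ for a nonsymmetric matrix A; the matrix A and the vector f defining the functional are made-up illustrative data, and exact rational arithmetic is used so the checks are exact.

```python
# Sketch: Lax-Milgram in the real Hilbert space R^2 with the dot product.
# B(x, y) = x . (A y) with a nonsymmetric matrix A; A and f are made-up data.
from fractions import Fraction as F

A = [[F(2), F(1)],
     [F(-1), F(2)]]          # B(y, y) = 2 |y|^2, so (18) holds with b = 2
f = [F(1), F(3)]             # the functional l(x) = x . f

def B(x, y):
    """The bilinear form B(x, y) = sum_i x_i (A y)_i."""
    Ay = [sum(A[i][j] * y[j] for j in range(2)) for i in range(2)]
    return sum(x[i] * Ay[i] for i in range(2))

def ell(x):
    return sum(x[i] * f[i] for i in range(2))

# l(x) = B(x, y) for all x means A y = f; solve the 2x2 system by Cramer's rule.
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
y = [(f[0] * A[1][1] - A[0][1] * f[1]) / det,
     (A[0][0] * f[1] - f[0] * A[1][0]) / det]

# It suffices to check the representation (19) on a basis.
basis = [[F(1), F(0)], [F(0), F(1)]]
represents = all(B(e, y) == ell(e) for e in basis)
```

Because B is not symmetric, it is not itself a scalar product, which is exactly the situation the lemma is designed for.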


6.4 LINEAR SPAN 


We recall from chapter 1 that the linear span of a collection of points $\{y_j\} = S$ is
the smallest linear subspace containing them. The closed linear span of a collection
of points S in a Hilbert space H is defined to be the smallest closed linear subspace
containing S, that is, the intersection of all such subspaces.


Exercise 5. Show that the closed linear span of a set is the closure of its linear span. 




Theorem 7. The point y of a Hilbert space H belongs to the closed linear span Y
of the set $\{y_j\}$ iff every vector z that is orthogonal to all $y_j$ is orthogonal also to y:

$$(y, z) = 0 \quad\text{for all } z \text{ that satisfy } (y_j, z) = 0 \text{ for all } j. \qquad (23)$$

Proof. We claim that the set Z of vectors z orthogonal to all $y_j$ forms the orthogonal
complement of Y. Since every vector z orthogonal to all $y_j$ is orthogonal to
all linear combinations of $y_j$, and by continuity to limits of linear combinations,
$Z \subset Y^\perp$. Conversely, every vector in $Y^\perp$ is orthogonal to all the $y_j$, and so belongs
to Z. This shows that $Z = Y^\perp$. We appeal to part (iii) of theorem 3 to conclude that
$Y = (Y^\perp)^\perp = Z^\perp$, as asserted in theorem 7. □

We stated in chapter 5 that every isometry of a Banach space onto itself that maps
0 into 0 is linear. We give now a new proof of this in Hilbert space:

Denote by $x \to x'$ an isometry of a Hilbert space that maps $0 \to 0$. Let x, y be
any pair of vectors, x' and y' their images. Since distances are preserved, d(0, x) =
d(0, x'), d(0, y) = d(0, y'), and d(x, y) = d(x', y'), which can be expressed as

$$\|x\| = \|x'\|, \qquad \|y\| = \|y'\|, \qquad (24)$$

$$\|x - y\|^2 = \|x' - y'\|^2. \qquad (24')$$

Expanding both sides in (24') and using (24), we get

$$(x, y) = (x', y'). \qquad (25)$$

Now denote x + y by z, and let u be any vector in H. Using (25) we have

$$(z', u') = (z, u) = (x + y, u) = (x, u) + (y, u) = (x', u') + (y', u') = (x' + y', u').$$

Thus

$$(z' - x' - y', u') = 0$$

for all u'. This can be only if z' = x' + y'.


The virtue of this proof is that it applies even when the scalar product is not
positive, as long as it is nondegenerate, meaning that no nonzero u is orthogonal to all points.

Definition. A collection of vectors $\{x_j\}$ is called orthonormal if

$$(x_j, x_k) = 0 \quad\text{for } j \ne k, \quad\text{and}\quad \|x_j\| = 1 \quad\text{for all } j. \qquad (26)$$

Definition. A collection of vectors $\{x_j\}$ is called an orthonormal base if the vectors
are orthonormal, and if the closed linear span of $\{x_j\}$ is the whole space.


Lemma 8. Let H denote a Hilbert space, $\{x_j\}$ an orthonormal set in H. The closed
linear span of $\{x_j\}$ consists of all vectors of the form

$$x = \sum a_j x_j, \qquad (27)$$

where the $a_j$ are complex numbers so chosen that

$$\sum |a_j|^2 < \infty.$$

The sum (27) converges in the sense of the Hilbert space norm. Furthermore

$$\|x\|^2 = \sum |a_j|^2, \qquad (28)$$

and

$$a_j = (x, x_j). \qquad (28')$$

Exercise 6. Prove lemma 8.
Theorem 9. Every Hilbert space contains an orthonormal basis.

Proof. Consider all orthonormal sets, partially ordered by inclusion. Given a totally
ordered collection, the union of all vectors contained in the sets in the collection
is an orthonormal set that includes all of them. Therefore, by Zorn's lemma, there is
an orthonormal set that is maximal. We claim that the closed linear span X of a maximal
orthonormal set $\{x_j\}$ is the whole space. We argue indirectly: suppose that there is a y
that does not belong to the closed linear span X. Define $a_j$ by

$$a_j = (y, x_j). \qquad (29)$$

We claim that Bessel's inequality holds:

$$\sum |a_j|^2 \le \|y\|^2; \qquad (30)$$

for consider

$$\Big\|y - \sum_F a_j x_j\Big\|^2, \qquad (31)$$

where F means a finite collection of j. Using the orthonormality of $\{x_j\}$, we find
that (31) equals

$$\|y\|^2 - \sum_F \bar{a}_j (y, x_j) - \sum_F a_j (x_j, y) + \sum_F |a_j|^2,$$

which by (29) equals

$$\|y\|^2 - \sum_F |a_j|^2.$$

Since (31) is nonnegative, (30) follows for every finite collection F, and therefore
for the infinite sum.

It follows then from lemma 8 that we can define a vector x by (27), and that x
belongs to X. Now using (29) and (28'), we have

$$(y - x, x_j) = (y, x_j) - (x, x_j) = a_j - a_j = 0,$$

meaning that y - x is orthogonal to all $x_j$. The difference y - x is not zero, since
by assumption y does not belong to X, while x does; so

$$\frac{y - x}{\|y - x\|}$$

could be joined to the orthonormal set $\{x_j\}$, enlarging it, contradicting maximality. □


Suppose that H is a separable Hilbert space; that is, it contains a denumerable
set of points that is dense. In this case every orthonormal basis is denumerable, and
the basis elements can be constructed without appealing to transcendental arguments
such as Zorn's lemma:

Theorem 9'. Let $\{y_j\}$ be a sequence of vectors in a Hilbert space whose closed
linear span is all of H. Then there exists an orthonormal basis $\{x_j\}$ such that the
linear span of $\{x_1, \ldots, x_n\}$ contains $y_1, \ldots, y_n$.


Exercise 7. Prove theorem 9’. 


The construction of the orthonormal basis $\{x_j\}$ in theorem 9' is called the
Gram-Schmidt process.
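A minimal sketch of the Gram-Schmidt process in $\mathbb{R}^n$ with the dot product follows. The function names, the sample vectors, and the tolerance used to discard vectors that already lie in the span of their predecessors are assumptions made for illustration.

```python
# Sketch of the Gram-Schmidt process in R^n with the dot product.
# The tolerance 1e-12 for dropping dependent vectors is an assumption.
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(ys, tol=1e-12):
    """Return orthonormal x_1, x_2, ... whose span contains every y_j."""
    xs = []
    for y in ys:
        # Subtract the components of y along the x_j found so far.
        r = list(y)
        for x in xs:
            c = dot(r, x)
            r = [ri - c * xi for ri, xi in zip(r, x)]
        length = math.sqrt(dot(r, r))
        if length > tol:          # skip y if it already lies in the span
            xs.append([ri / length for ri in r])
    return xs

ys = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [2.0, 1.0, 1.0], [0.0, 1.0, 1.0]]
xs = gram_schmidt(ys)
```

Here the third input vector is the sum of the first two, so it contributes no new basis element, and the four inputs yield three orthonormal vectors.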


Exercise 8. Let H be a Hilbert space; prove that any two orthonormal bases in H
have the same cardinality.


Theorem 10. Let H denote a Hilbert space, $\{x_j\}$ and $\{y_j\}$ two orthonormal bases.
According to lemma 8, every x can be written as

$$x = \sum a_j x_j.$$

Then the mapping

$$x \to \sum a_j y_j$$

is an isometry of H onto H, mapping $0 \to 0$. Furthermore every isometry of H onto
H mapping $0 \to 0$ can be obtained in this fashion.


Exercise 9. Prove theorem 10. 




Exercise 10. Show that every infinite-dimensional separable Hilbert space is isomorphic
with the space $\ell^2$ consisting of all vectors with infinitely many components:
$x = (a_1, a_2, \ldots)$, subject to the restriction $\|x\|^2 = \sum |a_j|^2 < \infty$.


NOTES. The abstract notion of Hilbert space described in this chapter is due to von
Neumann in 1929. Earlier, Hilbert and his school had used the concrete spaces described
in Examples 2 and 3; hence the name.
Theorem 2 is essentially due to Beppo Levi, in a concrete context.


BIBLIOGRAPHY 


Frechet, M. Sur les opérations linéaires, III. Trans. AMS, 8 (1907): 433-446.
Hilbert, D. Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen. Teubner, Leipzig,
1912.
Lax, P. D. and Milgram, A. Parabolic Equations. Contributions to the Theory of Partial Differential
Equations. Annals of Math. Studies 33. Princeton University Press, Princeton, 1954.
Levi, B. Sul Principio di Dirichlet. Rend. del Circolo Mat. di Palermo, 22 (1906): 293-300.
Riesz, F. Sur une espèce de géométrie analytique des systèmes de fonctions sommables. C. R. Acad. Sci.
Paris, 144 (1907): 1409-1411.
von Neumann, J. Allgemeine Eigenwerttheorie Hermitescher Funktionaloperatoren. Math. An., 102
(1929): 49-131.


7 


APPLICATIONS OF HILBERT 
SPACE RESULTS 


7.1 RADON-NIKODYM THEOREM 


Let ν and μ be finite nonnegative measures on the same σ-algebra. ν is said to be
absolutely continuous with respect to μ if every set that has μ-measure zero has
ν-measure zero. The Radon-Nikodym theorem asserts that such a measure ν can be
expressed as

$$\nu(E) = \int_E g\,d\mu, \qquad (1)$$

where g is a nonnegative function integrable with respect to μ.

Von Neumann showed how to derive this from the Riesz representation theorem
for linear functionals in Hilbert space:

Let H be the real Hilbert space $L^2(\mu + \nu)$, with the norm

$$\|x\|^2 = \int x^2\,d(\mu + \nu). \qquad (2)$$

Assume, for simplicity, that the μ and ν measure of the whole space is finite; then
it follows, via the Schwarz inequality, that every square integrable function is
integrable. The linear functional

$$\ell(x) = \int x\,d\mu \qquad (3)$$

is bounded with respect to the $L^2(\mu)$-norm, so even more with respect to the
$L^2(\mu + \nu)$-norm. Then, by theorem 4 of chapter 6, $\ell(x)$ can be represented as a scalar product
(x, y) for some y in $L^2(\mu + \nu)$:

$$\int x\,d\mu = \int xy\,d(\mu + \nu);$$



y depends only on the measures μ and ν. We rewrite this as

$$\int x(1 - y)\,d\mu = \int xy\,d\nu. \qquad (4)$$

We claim that

$$0 < y \le 1 \qquad (5)$$

except for a set of μ-measure zero. To show this, we denote by F the set on which
$y \le 0$, and claim that

$$\mu(F) = 0. \qquad (6)$$

Set x = 1 on F, x = 0 off F; with this choice, (4) becomes

$$\int_F (1 - y)\,d\mu = \int_F y\,d\nu. \qquad (7)$$

Since $y \le 0$ on F, the right side of (7) is $\le 0$, while the left side is $\ge \mu(F)$; this
proves (6).

Denote by G the set where y > 1; suppose that μ(G) > 0. Set x = 1 on G, x = 0
off G; with this choice (4) becomes

$$\int_G (1 - y)\,d\mu = \int_G y\,d\nu. \qquad (8)$$

Since y > 1 on G, the left side of (8) is negative, and the right side is nonnegative, a
contradiction. This completes the proof of (5).
We modify, if necessary, the function y on a set of μ-measure 0 so that (5) holds
everywhere. Since ν is absolutely continuous with respect to μ, this does not affect (4).
We claim that the function g in (1) is given by g = (1 - y)/y. To see this, denote
u = xy, and rewrite (4) as

$$\int ug\,d\mu = \int u\,d\nu. \qquad (9)$$

Let E be any measurable set; we choose x so that u is 1 on E, 0 off E. Then (9) gives

$$\int_E g\,d\mu = \nu(E). \qquad (10)$$

This is relation (1). □

Exercise 1. Prove the Radon-Nikodym theorem for measures that are only σ-finite.
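On a finite set the whole argument can be carried out by hand, which makes a convenient check of the formula g = (1 - y)/y. In the sketch below measures are lists of point weights and integrals are finite sums; the weights are made-up data, and every point is given positive μ-weight so that ν is automatically absolutely continuous with respect to μ.

```python
# Sketch: von Neumann's argument on a finite set, where measures are weight
# lists and integrals are finite sums. The weights below are made-up data.
from fractions import Fraction as F

mu = [F(1, 2), F(1, 4), F(2), F(1)]      # mu-weights of the four points
nu = [F(3), F(0), F(1, 3), F(5)]         # nu-weights; nu << mu since mu > 0

# The representer y of l(x) = sum x dmu in L^2(mu + nu) satisfies
# x_i * mu_i = x_i * y_i * (mu_i + nu_i) pointwise, so y = mu / (mu + nu).
y = [m / (m + n) for m, n in zip(mu, nu)]

# The density of (1) is then g = (1 - y) / y.
g = [(1 - yi) / yi for yi in y]

def integral(func, weights, E):
    """Integral of func over the index set E against the given weights."""
    return sum(func[i] * weights[i] for i in E)

E = [0, 2, 3]
lhs = integral(g, mu, E)                  # integral of g over E w.r.t. mu
rhs = sum(nu[i] for i in E)               # nu(E)
```

In this discrete setting g reduces to the ratio of the weights of ν and μ at each point, as one would expect of a density.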


7.2 DIRICHLET’S PROBLEM 
First, let D be a bounded domain in $\mathbb{R}^n$. Denote by $C_0^\infty(D)$ the space of real-valued
infinitely differentiable functions f whose support is contained in a compact subset
of D. On the space $C_0^\infty(D)$ we introduce two scalar products:

$$(f, g)_0 = \int_D fg\,dx \quad\text{and}\quad (f, g)_1 = \int_D \sum_j f_j g_j\,dx, \qquad (11)$$

where $f_j = \partial f/\partial x_j$, j = 1, ..., n.


Exercise 2. Verify that $C_0^\infty(D)$ is an inner product space under each of these scalar
products.

The following inequality connecting these two norms is due to Zaremba:

Lemma 1. For all f in $C_0^\infty(D)$,

$$\|f\|_0 \le d\,\|f\|_1, \qquad (12)$$

where d is the width of D.

Proof. Since f is zero on the boundary of D, at any point x in D,

$$f(x) = \int_{x^b}^{x} f_1\,dx_1,$$

where $x^b$ is a boundary point of D with the same $x_2, \ldots, x_n$ coordinates as x. Applying
the Schwarz inequality above gives

$$f^2(x) \le d \int f_1^2\,dx_1.$$

Integrating this over D gives (12). □


Denote by $H_0^1$ the completion of $C_0^\infty(D)$ with respect to the norm $\|\ \|_1$, by $H_0$ its
completion with respect to the norm $\|\ \|_0$.

Lemma 2. Every element v of $H_0^1$ belongs to $H_0$, and has partial derivatives $v_j$ of
first order that belong to $H_0$; these partial derivatives satisfy

$$(z, v_j)_0 = -(\partial z/\partial x_j, v)_0 \qquad (13)$$

for any $C_0^\infty$ function z. Furthermore formula (11) holds for f, g in $H_0^1$.




Proof. Let $\{v^{(n)}\}$ be a sequence of $C_0^\infty$ functions that tends to v in the $\|\ \|_1$ norm.
That means that the first derivatives $v_j^{(n)}$ converge in the $\|\ \|_0$ norm; we call these
limits $v_j$. By lemma 1, $\{v^{(n)}\}$ converges in the $\|\ \|_0$ norm to a limit in $H_0$, which we identify
with the limit v in $H_0^1$. Integration by parts gives relation (13) for $v^{(n)}$ in place of v;
letting $n \to \infty$ gives (13); relation (11) follows similarly. □

We claim that the identification of elements v in $H_0^1$ with elements of $H_0$ is
one-to-one, that is, an embedding of $H_0^1$ in $H_0$. We have to show that if v is zero in $H_0$,
then it is zero in $H_0^1$. Clearly, it follows from (13) that if v = 0 in $H_0$, then $v_j = 0$ in
$H_0$ for all j. This makes v = 0 in $H_0^1$.

Relation (13) asserts that $v_j$ are the first partial derivatives of v in the sense of
distributions (see Appendix B).

Let f be any element of $H_0$; define the linear functional $\ell$ by

$$\ell(u) = (u, f)_0. \qquad (14)$$

By the Schwarz inequality, and inequality (12) of lemma 1,

$$|\ell(u)| \le \|f\|_0\|u\|_0 \le d\,\|f\|_0\|u\|_1 \qquad (15)$$

for all u in $H_0^1$. According to the Riesz-Frechet representation theorem, theorem 4 of
chapter 6, the functional (14) can be represented as an inner product. That is, there
exists v in $H_0^1$ such that

$$(u, f)_0 = (u, v)_1 \qquad (16)$$

for all u in $H_0^1$. By definition (11) and lemma 2, with $u_j = \partial u/\partial x_j$,

$$(u, v)_1 = \sum (u_j, v_j)_0. \qquad (17)$$

Take now u to be $C_0^\infty$. We can, using the theory of distributions, rewrite the right
side of (17) as

$$-\sum (u, v_{jj})_0 = -(u, \Delta v)_0, \qquad (17')$$

where $v_{jj}$ are the second partial derivatives of v, and Δ the Laplace operator, acting
in the sense of distributions. Combining (17), (17'), and (16), we deduce that

$$(u, f)_0 = -(u, \Delta v)_0$$

for all u in $C_0^\infty$. From this it follows that in the sense of distribution theory,

$$f = -\Delta v. \qquad (18)$$

Thus v is a distribution solution of the inhomogeneous equation (18).
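A one-dimensional discrete analogue may help fix ideas. The sketch below, which is an illustration and not the method of the text, solves $-v'' = f$ on D = (0, 1) with v(0) = v(1) = 0 by second-order finite differences; the grid size, the right side f = 1, and the use of the Thomas algorithm for the tridiagonal system are all assumptions of the example.

```python
# Sketch: a one-dimensional discrete analogue of (16)-(18) on D = (0, 1).
# We solve -v'' = f with v(0) = v(1) = 0 by second-order finite differences;
# the grid size n and the right side f = 1 are illustrative assumptions.

def solve_dirichlet(f_values, h):
    """Solve (-v[i-1] + 2 v[i] - v[i+1]) / h^2 = f[i] at the interior nodes
    by the Thomas algorithm for tridiagonal systems."""
    m = len(f_values)
    diag = [2.0] * m
    rhs = [h * h * fi for fi in f_values]
    # Forward elimination; the off-diagonal entries are all -1.
    for i in range(1, m):
        w = -1.0 / diag[i - 1]
        diag[i] -= w * (-1.0)
        rhs[i] -= w * rhs[i - 1]
    # Back substitution.
    v = [0.0] * m
    v[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        v[i] = (rhs[i] + v[i + 1]) / diag[i]
    return v

n = 8                                    # number of subintervals
h = 1.0 / n
v = solve_dirichlet([1.0] * (n - 1), h)  # interior values v(h), ..., v(1 - h)

# For f = 1 the exact solution is v(x) = x (1 - x) / 2.
errors = [abs(v[i] - (i + 1) * h * (1 - (i + 1) * h) / 2) for i in range(n - 1)]
```

Since the exact solution is a quadratic, the second difference reproduces it exactly and the errors are at the level of rounding.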
Next we show that by virtue of belonging to $H_0^1$, v(x) tends to zero in an average
sense as x tends to the boundary of D. The precise statement is lemma 3:

Lemma 3. Suppose that D is a domain in $\mathbb{R}^2$ whose boundary ∂D is a $C^1$ curve.
For any point p on the boundary of D, choose a coordinate system $x_1, x_2$, with p as
the origin, and the positive $x_1$ axis perpendicular to the boundary of D and pointing
inward. Denote by R(p, d) all points of D where $x_1 < d$ and $|x_2| < d$. Let v be any
function in $H_0^1$. We claim that the mean value of |v| over R(p, d) tends to zero as d
tends to zero.


Proof. Since the area of R(p, d) is proportional to $d^2$, the claim is that

$$\int_{R(p,d)} |v|\,dx \le o(d^2). \qquad (19)$$

To deduce this, we need to estimate

$$\int_R |f|\,dx$$

for functions f in $C_0^\infty(D)$. Integrate by parts with respect to $x_1$:

$$\int_R |f|\,dx = \int_R |f|_1 (d - x_1)\,dx \le \int_R |f_1|(d - x_1)\,dx
\le \left(\int_R (d - x_1)^2\,dx \int_R f_1^2\,dx\right)^{1/2} \le d^2\left(\int_R f_1^2\,dx\right)^{1/2}; \qquad (20)$$

in the second inequality we have used the Schwarz inequality. Now approximate v in the
$\|\ \|_1$ norm by a sequence $v^{(n)}$ of $C_0^\infty(D)$ functions. The limit of (20) for $f = v^{(n)}$ is

$$\int_R |v|\,dx \le d^2\left(\int_R v_1^2\,dx\right)^{1/2}. \qquad (21)$$

Since the integral of $v_1^2$ over R(p, d) tends to zero as d tends to zero, (19) follows from
(21). □


We have thus succeeded in constructing a generalized function v that solves the
differential equation (18) in the sense of distribution theory, and that vanishes on the
boundary in a mean value sense.

The argument above can be extended to solve the Dirichlet problem in this generalized
sense for any second-order partial differential equation that is self-adjoint and
positive. We show now how the Lax-Milgram lemma can be used to extend the argument
above to non-self-adjoint partial differential operators. For example, consider
for any pair u, v in $H_0^1$ the functional B defined by


B(u, v) = / (u vj + X wj + uv) dx dy. (22) 
D 


Clearly, B is bilinear; it follows from lemma 1 that it is bounded. Estimating the
middle term by the Schwarz inequality, we see that B is positive in the sense of
theorem 6 of chapter 6. It follows then from that theorem that the linear functional
(14) can be represented in terms of B, meaning that there exists a v in $H_0^1$ such that

$$(u, f)_0 = B(u, v) \qquad (23)$$

for all u in $H_0^1$. Integrating by parts on the right in (23), we deduce that


fs Aut DT a eee ar ae EN 22) 


in the sense of distributions. 
We present now another method for solving the Dirichlet problem for the homogeneous
Laplace equation

$$\Delta v = 0 \quad\text{in } D \qquad (25)$$

whose value on the boundary is prescribed. This method exploits some special properties
of harmonic functions and yields genuine solutions that satisfy the boundary
condition in the usual sense. We assume that ∂D is once differentiable, and that the
boundary values of v are also once differentiable. We can then construct a $C^1$ function
f on D ∪ ∂D that has the prescribed value on ∂D, and we state the boundary
condition thus:

$$v = f \quad\text{on } \partial D. \qquad (26)$$

We reformulate the boundary value problem (25), (26): decompose f as

$$f = v + z, \qquad (27)$$

where v is harmonic and z vanishes on ∂D. When v and z have continuous first
partial derivatives up to the boundary, we can apply Green's formula:

$$(v, z)_1 = \int_D \sum_j v_j z_j\,dx\,dy = -\int_D (\Delta v)\,z\,dx\,dy + \int_{\partial D} \frac{\partial v}{\partial n}\,z\,ds.$$

Since by (25), Δv = 0 in D, and by (26), z = 0 on ∂D, the right side = 0. In
words, the space of harmonic functions and the space of functions that vanish on the
boundary are orthogonal to each other in the $(\ ,\ )_1$ scalar product.

Performing the decomposition (27) thus appears as the task of splitting f into the
sum of two functions from two orthogonal function spaces. We show how this can
be accomplished by appealing to theorem 3 of chapter 6. The scalar product defined
by (11) for all f in $C^\infty(D)$ is not positive because $(f, f)_1 = 0$, not only for f = 0
but for all constant functions f. We overcome this slight blemish by considering two
functions as equivalent if they differ by a constant.

Denote by $H_1$ the completion in the $\|\ \|_1$ norm of all $C^1$ functions on D ∪ ∂D; the
space $H_0^1$ is a closed subspace of $H_1$. We apply now the orthogonal decomposition
theorem, theorem 3 of chapter 6, to conclude that every f in $H_1$ can be decomposed
uniquely as in (27), where z is in $H_0^1$ and $v \perp H_0^1$. The condition $v \perp H_0^1$ asserts that

$$(u, v)_1 = \sum (u_j, v_j)_0 = 0$$

for all u in $H_0^1$. Taking u to be $C_0^\infty$, we can integrate by parts in the sense of
distributions to get

$$0 = \sum (u_j, v_j)_0 = -\sum (u, v_{jj})_0 = -(u, \Delta v)_0,$$

which implies that

$$\Delta v = 0$$

in the sense of distributions. It is a well-known result of Hermann Weyl that a function
harmonic in the distribution sense is harmonic in the classical sense; a proof is
provided in section 4 of Appendix B.

We claim that when f is continuous up to the boundary, z(q) tends to zero as
q approaches the boundary. Denote by d the distance of q to ∂D, and let C be the
circular disc with center q and radius r = d/2 and denote mean values over C by
bars: taking the mean value of (27), we obtain

$$\bar{f}(q) = \bar{z}(q) + \bar{v}(q). \qquad (28)$$

Since f is continuous up to the boundary, its mean value on C differs from its value
at the center q of C by an amount w that tends to zero as d tends to zero. According
to the elementary theory of partial differential equations, the mean value of the
harmonic function v on a circular disc C is equal to its value at the center of C. So (28)
can be rewritten as

$$f(q) + w = \bar{z}(q) + v(q).$$

Subtracting (27) from this, we conclude that

$$w = \bar{z}(q) - z(q). \qquad (29)$$

Since z belongs to $H_0^1$, according to lemma 3 the mean value of |z| over R(p, d) tends
to zero. It follows that the mean value of |z| over the disc C tends to zero. Combined
with (29), we conclude that z(q) itself tends to zero as q approaches the boundary.
Using (27), we see that the harmonic function v is continuous up to the boundary, and
its boundary value equals f. Thus we succeeded in constructing not a generalized,
but a genuine solution of the Dirichlet boundary value problem.

Here is another way of looking at (27). We saw that the boundary value problem
(25), (26) for the Laplace equation amounts to decomposing f as a sum of a harmonic
function and of one vanishing on the boundary. We showed that these spaces
are orthogonal to each other in the $(\ ,\ )_1$ scalar product. The decomposition was
accomplished by appealing to the orthogonal decomposition theorem, theorem 3 of
chapter 6. Here we show how to perform an orthogonal decomposition with respect
to the subspace V consisting of harmonic functions whose first derivatives are square
integrable in D. It is an easy fact in the theory of harmonic functions that V is complete
in the $\|\ \|_1$-norm. So, according to theorem 3 of chapter 6, we can decompose
any f in $H_1$ as

$$f = v + z, \qquad (30)$$

where v is in the space V of harmonic functions with square integrable first derivatives,
and z is orthogonal to V. Our aim is to show that z vanishes on the boundary.

We take D to lie in $\mathbb{R}^2$, and assume that the boundary of D is twice differentiable;
we assume f to be twice differentiable. For any pair of points p and q in the plane,
we define k(p, q) as the fundamental singular solution of the Laplace equation:


$$k(p, q) = \frac{1}{2\pi}\log|p - q|. \qquad (31)$$

Suppose that q lies in D. If we knew that the function z vanished on the boundary of
D, then by Green's formula, see section 4 of Appendix B, we would deduce that

$$z(q) = \int_D (z_x k_x + z_y k_y)\,dx\,dy,$$

where (x, y) are the coordinates of p. However, we do not at this point know that z
vanishes on ∂D, and so we denote the function defined by the integral above as u:

$$u(q) = \frac{1}{2\pi}\int_D \left(z_x\,\frac{x - x'}{|p - q|^2} + z_y\,\frac{y - y'}{|p - q|^2}\right)dx\,dy, \qquad (32)$$

where x', y' denote the coordinates of q.


Lemma 4. If $\partial D$ is twice differentiable, $u(q)$ defined by (32) is continuous up to the boundary and vanishes there.


Proof. It is easy to show that $u$ is continuous inside $D$. Let $q$ be a point of $D$ near the boundary; denote the nearest boundary point by $b$. Since $\partial D$ is assumed to be twice differentiable, there are two circular disks $S$ and $\tilde S$ with the same radius, $d$, tangent to $\partial D$ at $b$, $S$ contained in $D$ and $\tilde S$ exterior to $D$. Denote by $\tilde q = (\tilde x, \tilde y)$ the image of $q$ under inversion across the circle bounding $S$. For $q$ near enough to $\partial D$, $\tilde q$ lies in $\tilde S$.

Since $\tilde q$ lies outside $D$, $k(p, \tilde q)$ is a regular harmonic function in $D$. In particular, $k(p, \tilde q)$ belongs to $V$, and thus is orthogonal to $z$ in the $(\ ,\ )$ scalar product. Therefore $u(\tilde q) = 0$, and we can write

$$ u(q) = u(q) - u(\tilde q) = \frac{1}{2\pi} \int_D \left[ z_x \left( \frac{x - x'}{|p - q|^2} - \frac{x - \tilde x}{|p - \tilde q|^2} \right) + z_y \left( \frac{y - y'}{|p - q|^2} - \frac{y - \tilde y}{|p - \tilde q|^2} \right) \right] dx\, dy. \tag{33} $$


As $q$ approaches the boundary point $b$, so does the point $\tilde q$, the image of $q$ under the inversion. Therefore the integrand on the right in (33) tends to zero uniformly at all points of $D$ whose distance from $b$ exceeds any positive quantity $r$. It can be shown (see Lax for details) that also the integral over the remaining portion of $D$ tends to zero. This completes the proof of lemma 4. □


Lemma 5. The function $u$ defined in $D$ by (32) is twice differentiable in $D$, and

$$ \Delta u = \Delta z. \tag{34} $$


Proof. Since $f$ was assumed twice differentiable, and since by (30), $z$ differs from $f$ by a harmonic function $v$, it follows that $z$ is twice differentiable in $D$ and that $\Delta z = \Delta f$. Let $D'$ be a subdomain of $D$ that contains $q$, and whose closure is contained in $D$. We split the integral on the right of (32) and integrate by parts over $D'$:

$$ u(q) = z(q) + \int_{\partial D'} z\, \frac{\partial}{\partial n} k\, ds + \int_{D - D'} (z_x k_x + z_y k_y)\, dx\, dy. \tag{35} $$

The two integrals on the right are harmonic functions in $D'$; therefore, since $D'$ is arbitrary,

$$ u = z + h, \tag{36} $$

$h$ harmonic. This proves lemma 5. □


Express $z$ from (36) as $u - h$ and set into (30):

$$ f = v - h + u. $$

Since $v - h$ is harmonic, and $u = 0$ on $\partial D$, $v - h$ solves the boundary value problem (25), (26).

The preceding proof is a reworking of an argument of Garabedian and Schiffer.


BIBLIOGRAPHY


Garabedian, P. and Schiffer, M. On existence theorems of potential theory and conformal mapping. An. Math., 73 (1950): 107-121.


Lax, P. D. A remark on the method of orthogonal projection. CPAM, 4 (1951): 457-464.
von Neumann, J. On rings of operators, III. An. Math., 41 (1940): 94-161.
Nikodym, O. M. Sur une généralisation des intégrales de M. J. Radon. Fund. Math., 15 (1930): 131-179.


Radon, J. Theorie und Anwendung der absolut additiven Mengenfunktionen. S. B. Akad. Wiss. Wien, 122 
(1913): 1295-1438. 


Weyl, H. The method of orthogonal projection in potential theory. Duke Math. J., 7 (1940): 411-444.


Zaremba, S. Sur le principe de minimum. Krakauer Akademieberichte, 1909.


8 


DUALS OF NORMED LINEAR SPACES


8.1 BOUNDED LINEAR FUNCTIONALS 


In this chapter we deal with normed linear spaces $X$ over the real or the complex numbers. We will study linear functionals, namely mappings $\ell$ of $X$ into $\mathbb{R}$ or $\mathbb{C}$ satisfying

$$ \ell(ax) = a\,\ell(x), \qquad \ell(x + y) = \ell(x) + \ell(y), \tag{1} $$

that are in addition continuous. They are continuous in that they satisfy

$$ \lim_{n \to \infty} \ell(x_n) = \ell(x) \quad \text{when} \quad \lim_{n \to \infty} |x_n - x| = 0. \tag{2} $$


Definition. The collection of all continuous linear functionals is called the dual of $X$. It is denoted by $X'$.


Clearly, the sum and constant multiple of continuous linear functionals is contin- 
uous and linear; thus X’ is a linear space. 


Definition. A linear functional $\ell$ on $X$ is called bounded if there is a positive number $c$ such that

$$ |\ell(x)| \le c\,|x| \quad \text{for all } x \text{ in } X, \tag{3} $$

where $|\ |$ on the left denotes the absolute value.

Theorem 1. A linear functional $\ell$ on $X$ is continuous if and only if it is bounded.

Proof. Set $x_n - x = y_n$ in (2); using (1) and (3), we get

$$ |\ell(x_n) - \ell(x)| = |\ell(y_n)| \le c\,|y_n|; $$

this shows that boundedness implies continuity.

Suppose that $\ell$ is not bounded; then for any choice of $c = n$, (3) is violated by some $x_n$:

$$ |\ell(x_n)| > n\,|x_n|. $$

Clearly, $x_n$ can be replaced by any multiple of $x_n$; if we normalize $x_n$ so that

$$ |x_n| = \frac{1}{\sqrt{n}}, $$

then $x_n \to 0$ but $|\ell(x_n)| \to \infty$. This shows that lack of boundedness implies lack of continuity. □
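
The second half of the proof can be watched on a concrete unbounded functional. The sketch below is an illustration only, not taken from the text: it assumes the space of continuously differentiable functions on $[-1, 1]$ with the maximum norm, the functional $\ell(f) = f'(0)$, and the sequence $x_n(t) = \sin(nt)/\sqrt{n}$. The helper names `sup_norm`, `x_n` and `ell` are hypothetical.

```python
import math

def sup_norm(f, a=-1.0, b=1.0, samples=20001):
    """Approximate max |f(t)| on [a, b] by sampling a fine grid."""
    step = (b - a) / (samples - 1)
    return max(abs(f(a + i * step)) for i in range(samples))

def x_n(n):
    """x_n(t) = sin(n t) / sqrt(n): small in max norm, steep at t = 0."""
    return lambda t: math.sin(n * t) / math.sqrt(n)

def ell(n):
    """Exact value of ell(x_n) = x_n'(0) = n cos(0) / sqrt(n) = sqrt(n)."""
    return n / math.sqrt(n)

for n in (4, 100, 10000):
    # The norm shrinks like 1/sqrt(n) while ell(x_n) grows like sqrt(n).
    print(n, sup_norm(x_n(n)), ell(n))
```

The norms tend to zero while the values of the functional grow without bound, which is exactly the failure of (2) at $x = 0$.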


Theorem 2. The nullspace of a bounded linear functional $\ell$ on a normed linear space is a closed linear subspace. For $\ell$ nontrivial, meaning $\ell \not\equiv 0$, the nullspace has codimension 1.


Proof. The nullspace of any linear map is a linear subspace. Since a bounded linear functional is continuous, it follows that the inverse image of 0 is closed. That for $\ell \not\equiv 0$ the nullspace has codimension 1 is immediate. □


Definition. The norm of a bounded linear functional is the smallest $c$ for which (3) holds; it is denoted as $|\ell|$:

$$ |\ell| = \sup_{x \ne 0} \frac{|\ell(x)|}{|x|}; \tag{4} $$

by homogeneity, we may take $x$ to have norm equal to 1.
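
A small numerical sketch of definition (4), under assumptions that are not in the text: $X$ is the plane $\mathbb{R}^2$ with the Euclidean norm and $\ell(x) = a \cdot x$ for a fixed vector $a$. In that setting the Cauchy-Schwarz inequality gives $|\ell| = |a|$, and the hypothetical helper `functional_norm` estimates the supremum over unit vectors by sampling.

```python
import math

def functional_norm(a, samples=100000):
    """Estimate sup |a . x| over Euclidean unit vectors x in R^2."""
    best = 0.0
    for i in range(samples):
        theta = 2.0 * math.pi * i / samples
        x = (math.cos(theta), math.sin(theta))
        best = max(best, abs(a[0] * x[0] + a[1] * x[1]))
    return best

a = (3.0, 4.0)
estimate = functional_norm(a)
exact = math.hypot(*a)   # Euclidean length of a, here 5
print(estimate, exact)
```

The sampled supremum never exceeds the exact value and approaches it as the sampling is refined.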


Theorem 3. The dual $X'$ of any normed linear space is a complete normed linear space under the norm defined by (4).


Proof. Homogeneity and positivity are obvious. For subadditivity, consider two bounded linear functionals $\ell$ and $m$:

$$ |\ell + m| = \sup_{|x|=1} |(\ell + m)(x)| \le \sup_{|x|=1} \bigl( |\ell(x)| + |m(x)| \bigr) \le \sup_{|x|=1} |\ell(x)| + \sup_{|x|=1} |m(x)| = |\ell| + |m|. $$

We show now completeness; let $\{\ell_n\}$ be a Cauchy sequence in $X'$:

$$ |\ell_n - \ell_m| \to 0 \quad \text{as } n, m \to \infty. \tag{5} $$

According to definition (4) of the norm for functionals and (5),

$$ |(\ell_n - \ell_m)(x)| = |\ell_n(x) - \ell_m(x)| \le |\ell_n - \ell_m|\,|x| \to 0 \quad \text{as } n, m \to \infty, $$

for every $x$ in $X$. Since the field of scalars, $\mathbb{R}$ or $\mathbb{C}$, is complete,

$$ \lim_{n \to \infty} \ell_n(x) = \ell(x) $$

exists. It is easy to show that $\ell(x)$ is linear and bounded, and it is not hard to deduce from (5) that if $|\ell_n - \ell_m| < \epsilon$ for $m > n$, then also $|\ell_n - \ell| \le \epsilon$. Therefore

$$ \lim_{n \to \infty} |\ell_n - \ell| = 0. \qquad \square $$


8.2 EXTENSION OF BOUNDED LINEAR FUNCTIONALS 


So far we have not shown the existence of a single linear functional except $\ell \equiv 0$. Certainly there are lots of them in a Hilbert space; we show now that there are just as many in a Banach space. The tool needed is the Hahn-Banach theorem, specialized to the case

$$ p(x) = c\,|x|. $$


Theorem 4. Let $X$ be a normed linear space over the real or complex numbers, $Y$ a subspace, and $\ell$ a linear functional defined on $Y$ and bounded there:

$$ |\ell(y)| \le c\,|y|, \quad y \text{ in } Y. $$

Then $\ell$ can be extended as a bounded linear functional to all of $X$ so that its bound on $X$ equals its bound on $Y$.


This theorem is a special case of theorem 8 of chapter 3. We give now some 
applications. 


Theorem 5. Say that $y_1, \ldots, y_N$ are $N$ linearly independent vectors in a normed linear space $X$, $a_1, \ldots, a_N$ arbitrary complex numbers. Then there exists a bounded linear functional $\ell$ such that

$$ \ell(y_j) = a_j, \quad j = 1, \ldots, N. \tag{6} $$


Proof. Denote by $Y$ the linear space spanned by $y_1, \ldots, y_N$; it consists of vectors of the form

$$ y = \sum b_j y_j. $$

Since the $y_j$ are linearly independent, this representation of $y$ is unique. Now define $\ell$ on $Y$ by

$$ \ell(y) = \sum b_j a_j. $$

Clearly, $\ell$ is linear and bounded on $Y$, and satisfies (6); by theorem 4, it can be boundedly extended to all of $X$. □


Corollary 4'. Every finite-dimensional subspace $Y$ of a normed linear space $X$ has a closed complement.


Proof. Choose a basis $y_1, \ldots, y_N$ in $Y$. According to theorem 5, there exist $N$ bounded linear functionals $\ell_j$, $j = 1, \ldots, N$, such that

$$ \ell_j(y_k) = \delta_{jk}; $$

according to theorem 2, the nullspace $Z_j$ of $\ell_j$ is closed. So then is their intersection

$$ Z = Z_1 \cap \cdots \cap Z_N. $$

It is easy to check that $Z$ and $Y$ are complementary, namely that $X = Y \oplus Z$. □


Theorem 6. For every $y$ in a normed linear space $X$ over the real or complex field,

$$ |y| = \max_{|\ell| = 1} |\ell(y)|. \tag{7} $$


Proof. By definition (4) of $|\ell|$, $|\ell(y)| \le |\ell|\,|y|$. It follows that the right side of (7) is $\le$ the left side. Therefore to prove the result, we have to exhibit for every $y$ in $X$ an $\ell$ in $X'$ such that

$$ \ell(y) = |y|, \quad |\ell| = 1. $$

To accomplish this, we note that this defines $\ell$ on all scalar multiples $Y$ of $y$ as $\ell(ay) = a\,|y|$. Clearly, on this one-dimensional space $Y$, $\ell$ has norm $= 1$. By theorem 4, $\ell$ can be extended to all of $X$ so that $|\ell| = 1$. □

Corollary 5'. When the field of scalars is $\mathbb{R}$, for every $x$ in $X$

$$ |x| = \max_{|\ell| = 1} \ell(x). \tag{8} $$
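
In concrete spaces the norming functional of theorem 6 can often be written down without Hahn-Banach. The sketch below is an illustration under stated assumptions, not part of the text: $X$ is $\mathbb{R}^3$ with the $p$-norm, functionals are coefficient vectors $c$ acting by $\ell(x) = \sum c_i x_i$, and the norm of such a functional is taken to be the $q$-norm of $c$ with $1/p + 1/q = 1$ (Hölder's inequality). The names `p_norm` and `norming_functional` are hypothetical.

```python
def p_norm(v, p):
    """The p-norm of a finite real vector."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def norming_functional(y, p):
    """Coefficients c with sum(c_i * y_i) = |y|_p and dual norm |c|_q = 1.

    Assumes y is not the zero vector.
    """
    norm = p_norm(y, p)
    sign = lambda t: (t > 0) - (t < 0)
    return [sign(t) * abs(t) ** (p - 1) / norm ** (p - 1) for t in y]

p = 3.0
q = p / (p - 1.0)          # conjugate exponent, 1/p + 1/q = 1
y = [1.0, -2.0, 0.5]
c = norming_functional(y, p)
value = sum(ci * yi for ci, yi in zip(c, y))
print(value, p_norm(y, p), p_norm(c, q))
```

The first two printed numbers agree and the third equals 1 up to rounding, so this $c$ attains the maximum in (8) for the chosen $y$.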
The following is a far-reaching generalization of theorem 6. 


Theorem 7. $X$ is a normed linear space over $\mathbb{C}$, $Y$ a linear subspace of $X$. For any $z$ in $X$, denote by $m(z)$ its distance from $Y$:

$$ m(z) = \inf_{y \text{ in } Y} |z - y|. \tag{9} $$

We claim that for every $z$ in $X$

$$ m(z) = M(z), \tag{10} $$

where

$$ M(z) = \max_{|\ell| \le 1,\ \ell = 0 \text{ on } Y} |\ell(z)|. \tag{11} $$


Proof. Since the functionals $\ell$ entering the maximum problem (11) vanish on $Y$, and since $|\ell| \le 1$, $|\ell(z)| = |\ell(z - y)| \le |z - y|$ holds for all $y$ in $Y$; therefore

$$ |\ell(z)| \le \inf_{y \text{ in } Y} |z - y| = m(z). $$

It follows from this and the definition (11) of $M(z)$ that

$$ M(z) \le m(z). \tag{12} $$

To show equality, we look at the linear space $Y_0$ consisting of all vectors of the form $y + az$, $y$ in $Y$, $a$ complex, and define on $Y_0$ the linear functional $\ell_0$:

$$ \ell_0(y + az) = a\,m(z). \tag{13} $$

By definition (9) of $m$, it follows that $\ell_0$ is bounded on $Y_0$ by 1; so by theorem 4, it can be extended to all of $X$ so that $|\ell_0| = 1$. Set $y = 0$, $a = 1$ in (13):

$$ \ell_0(z) = m(z). $$

Combined with (12) this shows that $\ell_0$ solves the maximum problem (11), and that (10) holds. □
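
The equality $m(z) = M(z)$ can be checked by hand in the simplest case. The following sketch assumes, for illustration only, the real Euclidean plane instead of a general complex normed space, with $Y$ the line spanned by a vector $y$. There the functionals of norm 1 that vanish on $Y$ are $\pm n$, where $n$ is a unit normal to $y$. The function names are hypothetical.

```python
import math

y = (1.0, 1.0)            # Y = span{y}
z = (3.0, 1.0)

def dist_to_line(z, y):
    """m(z): distance from z to Y via orthogonal projection (Euclidean norm)."""
    t = (z[0] * y[0] + z[1] * y[1]) / (y[0] ** 2 + y[1] ** 2)
    return math.hypot(z[0] - t * y[0], z[1] - t * y[1])

def max_annihilating_functional(z, y):
    """M(z): in R^2 the unit functionals vanishing on Y are +-n, n normal to y."""
    length = math.hypot(*y)
    n = (-y[1] / length, y[0] / length)
    return abs(n[0] * z[0] + n[1] * z[1])

m = dist_to_line(z, y)
M = max_annihilating_functional(z, y)
print(m, M)
```

Both numbers equal $\sqrt{2}$ for this choice of $y$ and $z$: the minimum problem (9) and the maximum problem (11) have the same value.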


REMARK 1. In case $Y$ is the trivial subspace consisting of $\{0\}$, theorem 7 reduces to theorem 6.


Theorem 7 is an example of dual variational problems: a pair of a minimum and 
a maximum problem whose extreme values are equal. 


Definition. The set of linear functionals $\ell$ that vanish on a subspace $Y$ of $X$ is called the annihilator of $Y$, and is denoted by $Y^\perp$.


Exercise 1. Show that $Y^\perp$ is a closed linear subspace of $X'$.


Exercise 2. Let $Y$ be a closed subspace of a normed linear space $X$. Show that the dual of $(X/Y)$ is isometrically isomorphic with $Y^\perp$.


Theorem 7'. $X$ is a normed linear space over $\mathbb{C}$, $Y$ a subspace of $X$. For any $\ell$ in $X'$, define

$$ |\ell|_Y = \sup_{y \text{ in } Y,\ |y| = 1} |\ell(y)|. \tag{14} $$

We claim that

$$ |\ell|_Y = \min_{m \text{ in } Y^\perp} |\ell - m|. \tag{15} $$


Proof. For any $m$ in $Y^\perp$, and any $y$ in $Y$ with $|y| = 1$,

$$ |\ell(y)| = |(\ell - m)(y)| \le |\ell - m|. \tag{16} $$

It follows that $|\ell|_Y$ is $\le$ the right side of (15).

According to theorem 4, the restriction of $\ell$ to $Y$ has an extension to $X$, call it $\ell_0$, whose norm on $X$ equals its norm on $Y$:

$$ |\ell_0| = |\ell|_Y. \tag{17} $$

Since $\ell_0$ and $\ell$ are equal on $Y$, $\ell - \ell_0 = m$ belongs to $Y^\perp$; furthermore by (17),

$$ |\ell - m| = |\ell_0| = |\ell|_Y. \tag{17'} $$

This combined with (16) proves that equality holds in (15). □

Theorem 7' is another example of dual variational problems.

Exercise 3. Show that $Y'$ is isometrically isomorphic with $X'/Y^\perp$.


Definition. The closed linear span of a subset $\{y_j\}$ of a normed linear space is the smallest closed linear space containing all $y_j$, that is, the intersection of all closed linear spaces containing all $y_j$.


Exercise 4. Show that the closed linear span of $\{y_j\}$ is the closure of the linear span $Y$ of $\{y_j\}$, consisting of all finite linear combinations of the $y_j$:

$$ y = \sum_j a_j y_j. \tag{18} $$


The following result, called the spanning criterion, is one of the workhorses of 
functional analysis. 


Theorem 8. A point $z$ of a normed linear space $X$ belongs to the closed linear span $Y$ of a subset $\{y_j\}$ of $X$ iff every bounded linear functional $\ell$ that vanishes on the subset vanishes at $z$; that is,

$$ \ell(y_j) = 0 \quad \text{for all } y_j \tag{19} $$

implies that $\ell(z) = 0$.


Proof. Since $\ell$ is linear, (19) implies that $\ell(y) = 0$ for $y$ of form (18); since $\ell$ is continuous, it vanishes on all limits of points of form (18). Conversely, suppose that $z$ does not belong to the closed linear span $Y$ of $\{y_j\}$; then

$$ \inf_{y \text{ in } Y} |z - y| = d > 0. \tag{20} $$

Define the subspace $Z$ to consist of all points of the form

$$ y + az, \quad y \text{ in } Y, \tag{21} $$

and define on $Z$ the linear functional $\ell_0$ by

$$ \ell_0(y + az) = a. $$

It follows from (20) that

$$ |y + az| \ge d\,|a|. $$

Combining this with the definition of $\ell_0$, we deduce that on $Z$, $\ell_0$ is bounded by $d^{-1}$. So by theorem 4, $\ell_0$ can be extended boundedly to all of $X$. By definition,

$$ \ell_0(y_j) = 0 \quad \text{for all } y_j, \qquad \ell_0(z) = 1. \qquad \square $$


NOTE. Theorem 8 is a generalization to Banach spaces of theorem 7 in chapter 6 on Hilbert spaces.


8.3 REFLEXIVE SPACES 


The dual $X'$ of a normed linear space has its own dual, denoted as $X''$. Since $\ell(x)$ is a bilinear function of $\ell$ and $x$, and is bounded by definition (4), it follows that, for fixed $x$, $\ell(x)$ is a bounded linear functional of $\ell$. It follows from theorem 6 that the norm of this linear functional is $|x|$. Thus the space $X$ is, in this natural way, isometrically embedded in $X''$. It is a basic result of the theory of finite-dimensional vector spaces that $X'' = X$. This is no longer true for all Banach spaces.


Definition. A Banach space is called reflexive if $X'' = X$, that is, if $X$ is all of $X''$.

Theorem 9. Every Hilbert space is reflexive.

Proof. This is an immediate consequence of theorem 4 in chapter 6. □

The following result is due to Milman:

Theorem 10. A uniformly convex Banach space is reflexive.

For proof we refer to Milman.

We stated in chapter 5 that the $L^p$ spaces, $1 < p < \infty$, are uniformly convex. Combining this result of Clarkson's with the preceding result of Milman, we conclude that $L^p$, $1 < p < \infty$, are reflexive.


Theorem 11. The dual of $L^p$, $1 < p < \infty$, is $L^q$, $\frac{1}{p} + \frac{1}{q} = 1$.


Proof. We saw in chapter 5 that for any $u$ in $L^q$ we can define a bounded linear functional $\ell$ on $L^p$ by

$$ \ell(f) = (f, u) = \int f(s)\, u(s)\, dm. $$

Furthermore we showed, as theorem 5 of chapter 5, that the norm of this linear functional is $|u|_q$. Thus $L^q$ is isometrically embedded in $(L^p)'$. We claim that $L^q$ is all of $(L^p)'$; for if not, there would be some $z$ in $(L^p)'$ not in $L^q$. Since $L^q$ is closed, it would follow from the spanning criterion, theorem 8, that there is an $\ell \ne 0$ in $(L^p)''$ such that $\ell(u) = 0$ for all $u$ in $L^q$. Since $L^p$ is reflexive, $\ell$ lies in $L^p$, and it follows that $(\ell, u) = 0$ for all $u$ in $L^q$. By theorem 5 of chapter 5, this implies that $\ell = 0$, a contradiction. □


We give now a second proof that for $p < 2$ the dual of $L^p$ is $L^q$, without appealing to uniform convexity.

We assume for simplicity that the total measure with respect to which we form the $L^p$ norms equals 1. Then, by Hölder's inequality with $p' = 2/p$, $q' = 2/(2 - p)$,

$$ \|f\|_p^p = \int |f|^p\, dm \le \|1\|_{2/(2-p)}\, \bigl\| |f|^p \bigr\|_{2/p} = \|f\|_2^p. $$

So $\|f\|_p \le \|f\|_2$ for all $f$ in $L^2$.
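
The comparison of norms just derived is easy to observe numerically. The sketch below assumes, purely for illustration, the uniform probability measure on six points, so that a function is a list of six values; `p_norm` is a hypothetical helper.

```python
def p_norm(values, p):
    """L^p norm for the uniform probability measure on len(values) points."""
    n = len(values)
    return (sum(abs(v) ** p for v in values) / n) ** (1.0 / p)

f = [0.3, -1.7, 2.0, 0.0, 4.5, -0.9]
exponents = [1.0, 1.5, 2.0, 3.0, 6.0]
norms = [p_norm(f, p) for p in exponents]
for p, value in zip(exponents, norms):
    print(p, value)
```

The printed norms are nondecreasing in $p$; with total measure different from 1 this ordering can fail, which is why the normalization is assumed.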


Let $\ell$ be a linear functional defined on all $L^p$ functions that is bounded in the $L^p$ norm:

$$ |\ell(f)| \le \text{const.}\, \|f\|_p. $$

Since the $p$-norm is less than the 2-norm, $\ell$ is also bounded in the $L^2$-norm. According to the representation theorem, theorem 4 in chapter 6, we can express $\ell$ as

$$ \ell(f) = \int f u\, dm, \quad u \text{ in } L^2. \tag{22} $$

We claim that in fact $u$ lies in $L^q$. To see this, we choose $f = f_k$ as follows:

$$ f_k(x) = |u_k|^{q-1}(x)\, \operatorname{sgn} u(x), $$

where

$$ |u_k|(x) = \min\{|u(x)|, k\}, $$

$k$ a constant. Setting $f = f_k$ in (22) gives

$$ \ell(f_k) = \int f_k u\, dm = \int |u_k|^{q-1} |u|\, dm \ge \int |u_k|^q\, dm. $$

On the other hand,

$$ \|f_k\|_p^p = \int |u_k|^{(q-1)p}\, dm = \int |u_k|^q\, dm. $$

Since, by assumption, $|\ell(f_k)| \le c\,\|f_k\|_p$, the last two relations imply that

$$ \int |u_k|^q\, dm \le c \left( \int |u_k|^q\, dm \right)^{1/p}. $$

Dividing both sides by the right side gives

$$ \|u_k\|_q \le c. $$

Letting $k \to \infty$ we deduce that $u$ is in $L^q$. This completes the proof. □


Exercise 5. Show that if the total measure equals 1, then $\|f\|_p$ is an increasing function of $p$.


Theorem 12. $C[-1, 1]$, normed by the maximum norm, is not reflexive.


Proof. If it were, $C$ would be the dual of $C'$. According to theorem 6 applied to $X = C'$, for every $\ell$ in $C'$ there is an $f$ in $C'' = C$ such that

$$ |\ell| = \ell(f), \quad |f|_{\max} = 1. \tag{23} $$

Now define

$$ \ell(g) = \int_{-1}^{0} g(t)\, dt - \int_{0}^{1} g(t)\, dt. $$

Clearly, for every nonzero $g$ in $C[-1, 1]$,

$$ |\ell(g)| < 2\,|g|_{\max}, \tag{23'} $$

but given any $\epsilon > 0$, we can choose $g$ so that

$$ |\ell(g)| > (2 - \epsilon)\,|g|_{\max}. $$

This result shows that $|\ell| = 2$. Along with (23'), it contradicts (23) for $g = f$. □
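
To see why the norm 2 is approached but never attained, one can evaluate the functional on continuous ramps that switch from $+1$ to $-1$ in a short interval around 0. The sketch below is an illustration with hypothetical helpers `ramp` and `ell`; the integrals are computed by a midpoint rule.

```python
def ramp(delta):
    """Continuous g with max norm 1: +1 left of -delta, -1 right of delta."""
    def g(t):
        if t <= -delta:
            return 1.0
        if t >= delta:
            return -1.0
        return -t / delta
    return g

def ell(g, cells=20000):
    """Midpoint-rule value of int_{-1}^{0} g dt - int_{0}^{1} g dt."""
    h = 1.0 / cells
    left = sum(g(-1.0 + (i + 0.5) * h) for i in range(cells)) * h
    right = sum(g((i + 0.5) * h) for i in range(cells)) * h
    return left - right

for delta in (0.5, 0.1, 0.01):
    print(delta, ell(ramp(delta)))   # analytically 2 - delta
```

For each ramp the value is $2 - \delta$: it can be pushed as close to 2 as desired, while a continuous function of maximum norm 1 can never give exactly 2, since it would have to jump from $+1$ to $-1$ at $t = 0$.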


Theorem 13. Let $Z$ be a normed linear space over $\mathbb{C}$. If $Z'$ is separable, so is $Z$.


Proof. Separability means that $Z'$ contains a dense denumerable set $\{\ell_n\}$. By definition of the norm in $Z'$, there is a $z_n$ in $Z$ such that

$$ |z_n| = 1, \quad |\ell_n(z_n)| > \tfrac{1}{2}\,|\ell_n|. \tag{24} $$

We claim that the denumerable set $\{z_n\}$ has $Z$ as its closed linear span. According to theorem 8 this means that a linear functional $\ell$ that vanishes on every $z_n$ vanishes everywhere. Suppose, on the contrary, that there is an $\ell$ such that

$$ \ell(z_n) = 0 \quad \text{for all } n, \quad \text{and} \quad |\ell| = 1. \tag{25} $$

Since $\{\ell_n\}$ are dense in $Z'$, we can find an $\ell_n$ such that

$$ |\ell - \ell_n| < \tfrac{1}{3}. \tag{26} $$

Since $|\ell| = 1$, it follows that

$$ |\ell_n| > \tfrac{2}{3}. \tag{26'} $$

Since $\ell(z_n) = 0$, it follows from (26), (24) that

$$ \tfrac{1}{3} > |(\ell - \ell_n)(z_n)| = |\ell_n(z_n)| > \tfrac{1}{2}\,|\ell_n|. $$

This contradicts (26'), and shows that no $\ell$ satisfying (25) exists. It proves, by theorem 8, that finite linear combinations of the $z_n$ are dense in $Z$. But then finite linear combinations of the $z_n$ with rational coefficients also are dense in $Z$; since these are denumerable, $Z$ is separable. □


Theorem 13 furnishes another proof of theorem 12. For $C[-1, 1]$ is separable: every continuous function can be approximated by piecewise linear functions with rational nodes and rational ordinates. On the other hand, $C'$ is not separable; the linear functionals $\ell_s$ defined by

$$ \ell_s(f) = f(s), \quad -1 \le s \le 1, $$

are clearly each bounded by 1, and equally clearly

$$ |\ell_s - \ell_t| = 2 \quad \text{for } s \ne t. $$

Since the $\{\ell_s\}$ form a nondenumerable collection, $C'$ cannot contain a dense denumerable subset. It follows now that $C'' \ne C$. If it were, we could apply theorem 13 to $Z = C'$ and conclude that since $C'' = C$ is separable, so is $C'$, but $C'$ is not separable.


The conclusion of theorem 12 is applicable to the space C(Q), Q any Hausdorff 
space containing more than a discrete set of points. This is the precise state of affairs: 


Theorem 14. Let $Q$ be a compact Hausdorff space, $C(Q)$ the space of continuous real-valued functions on $Q$, normed by the max norm.


(i) $C'$ consists of all signed measures $m$ of finite total mass, defined over all Borel sets. That is, every bounded linear functional $\ell$ on $C(Q)$ can be written as

$$ \ell(f) = \int_Q f\, dm. \tag{27} $$

The norm of $\ell$ is

$$ |\ell| = \int_Q |dm|. \tag{28} $$

The measure $m$ is uniquely determined by $\ell$.
(ii) $C''$ is $L^\infty(Q)$, the space of all bounded, Borel-measurable functions on $Q$.


The prototype of this basic result is due to F. Riesz; the general result is due to 
Kakutani. A functional analytic proof for Q metric is supplied in Appendix A. 


NOTE. Theorem 14 is emphatically false when $Q$ is not compact and $C(Q)$ is the space of all bounded continuous functions on $Q$, normed by the sup norm. Here is what happens:

Take $Q$ to be the real line $\mathbb{R}$, $\{t_k\}$ a sequence of points $\to \infty$. We take $Y$ to be the subspace of $C(\mathbb{R})$ consisting of all functions $f$ for which

$$ \lim_{k \to \infty} f(t_k) = f_\infty $$

exists. For $f$ in $Y$ we define the functional $\ell$ by

$$ \ell(f) = f_\infty. $$

Clearly, $\ell$ is linear and $\ell$ is bounded on $Y$: $|\ell|_Y \le 1$. By the Hahn-Banach theorem, $\ell$ can be extended to all of $C(\mathbb{R})$ as a bounded linear functional.


We claim that this $\ell$ cannot be of the form (27). If it were, the value of $\ell(f)$ would depend on values of $f$ on any compact interval $I$ on which

$$ \int_I |dm| \ne 0. $$

But clearly, we can alter the values of $f$ in $Y$ on $I$ without changing the value of $\ell(f)$; so there cannot be such a dependence.


The following result is of some interest: 


Theorem 15. A closed linear subspace $Y$ of a reflexive Banach space $X$ is reflexive.


Proof. Every bounded linear functional $\ell$ on $X$, when restricted to $Y$, becomes a bounded linear functional on $Y$; we denote this functional by $\ell_0$. Since by Hahn-Banach every bounded linear functional on $Y$ can be extended to $X$, this restriction map $\ell \to \ell_0$,

$$ X' \to Y', $$

maps $X'$ onto $Y'$. The restriction map induces the following mapping from $Y''$ to $X''$: For any $\eta$ in $Y''$ we define $\phi$ in $X''$ by setting, for any $\ell$ in $X'$,

$$ \phi(\ell) = \eta(\ell_0), \tag{29} $$

where $\ell_0$ is the restriction of $\ell$ to $Y$. Since $X$ is reflexive, $\phi$ can be identified with an element $z$ of $X$:

$$ \phi(\ell) = \ell(z); $$

setting this into (29) gives

$$ \ell(z) = \eta(\ell_0). \tag{29'} $$

We claim that $z$ belongs to $Y$. To show this, we note that if $\ell$ belongs to $Y^\perp$, meaning it vanishes on $Y$, then $\ell_0 = 0$, and so by (29'), $\ell(z) = 0$. We appeal now to theorem 8 to conclude that $z$ belongs to the closure of $Y$. But since $Y$ is closed, $z$ belongs to $Y$. So we can rewrite (29') as

$$ \ell_0(z) = \eta(\ell_0). \tag{30} $$

Since every functional in $Y'$ occurs as $\ell_0$, (30) shows that every $\eta$ in $Y''$ can be identified with some $z$ in $Y$. □


8.4 SUPPORT FUNCTION OF A SET


We recall from chapter 1 the notion of the convex hull of a pointset $M$ in a linear space $X$ over the reals as the smallest convex set in $X$ containing $M$, that is, the intersection of all convex sets that contain $M$. The convex hull of $M$ is denoted by $\hat M$.

As remarked in theorem 6 of chapter 1, $\hat M$ consists of all convex combinations of points of $M$. These are points of the form

$$ \sum_j a_j x_j, \quad x_j \text{ in } M, \tag{31} $$

$$ a_j \ge 0, \quad \sum_j a_j = 1. \tag{31'} $$


Definition. The closed convex hull of a subset $M$ of a normed linear space $X$ is the smallest closed convex set containing $M$, that is, the intersection of all closed convex sets containing $M$. We denote this set as $\breve M$.


Exercise 6. Show that the closed convex hull of $M$ is the closure of the convex hull of $M$.


We define the support function $S_M$ of a subset $M$ of $X$ as the following function on $X'$:

$$ S_M(\ell) = \sup_{y \text{ in } M} \ell(y). \tag{32} $$


Theorem 16. Support functions have the following properties:

(i) Subadditivity, for all $\ell$, $m$ in $X'$, $S_M(\ell + m) \le S_M(\ell) + S_M(m)$.
(ii) $S_M(0) = 0$.
(iii) Positive homogeneity, $S_M(a\ell) = a\,S_M(\ell)$ for $a > 0$.
(iv) Monotonicity, for $M \subset N$, $S_M(\ell) \le S_N(\ell)$.
(v) Additivity, $S_{M+N} = S_M + S_N$.
(vi) $S_{-M}(\ell) = S_M(-\ell)$.
(vii) $S_{\bar M} = S_M$.
(viii) $S_{\hat M} = S_M$.


Exercise 7. Prove theorem 16. 


We give now some examples. 


(a) $M$ consists of a single point $x_0$,

$$ S_{\{x_0\}}(\ell) = \ell(x_0). $$

(b) $M$ is the ball $B_R$ of radius $R$ around 0: $\{|x| < R\}$,

$$ S_{B_R}(\ell) = R\,|\ell|. $$

(c) $M$ is the ball $B_R(x_0)$: $\{|x - x_0| < R\}$; using examples (a) and (b), and part (v) of theorem 16, we get

$$ S_{B_R(x_0)}(\ell) = \ell(x_0) + R\,|\ell|. \tag{33} $$
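
Formula (33) can be compared with a brute-force supremum. The sketch assumes, for illustration only, the Euclidean plane, where a functional is a vector acting by the dot product and its norm is its Euclidean length; since a linear functional on a disk has its supremum on the rim, only rim points are sampled. The name `support_of_disk` is hypothetical.

```python
import math

def support_of_disk(ell, center, radius, samples=100000):
    """Estimate sup of ell . x over the Euclidean disk by sampling its rim."""
    best = -math.inf
    for i in range(samples):
        theta = 2.0 * math.pi * i / samples
        x = (center[0] + radius * math.cos(theta),
             center[1] + radius * math.sin(theta))
        best = max(best, ell[0] * x[0] + ell[1] * x[1])
    return best

ell = (1.0, -2.0)
center = (0.5, 3.0)
radius = 2.0
estimate = support_of_disk(ell, center, radius)
# Right side of (33): ell(x0) + R |ell|
formula = ell[0] * center[0] + ell[1] * center[1] + radius * math.hypot(*ell)
print(estimate, formula)
```

The sampled value stays below the closed form and agrees with it to within the sampling error.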


Theorem 17. $X$ is a normed linear space over $\mathbb{R}$, $M$ a bounded subset of $X$. A point $z$ of $X$ belongs to the closed, convex hull $\breve M$ of $M$ iff for all $\ell$ in $X'$,

$$ \ell(z) \le S_M(\ell). \tag{34} $$


Proof. By definition (32) of support function, for all $\ell$ in $X'$ and any $z$ in $\breve M$, $\ell(z) \le S_{\breve M}(\ell)$. By parts (vii) and (viii) of theorem 16, $S_{\breve M} = S_M$, so that (34) is satisfied for all $z$ in $\breve M$.

Conversely, suppose that $z$ does not belong to $\breve M$. Since $\breve M$ is closed, some open ball $B_R(z)$ centered at $z$ does not intersect $\breve M$. By the extended hyperplane separation theorem, theorem 6 of chapter 3, there is a nonzero linear functional $\ell_0$ and a real number $c$ such that

$$ \ell_0(u) \le c \le \ell_0(v) \tag{35} $$

for all $u$ in $\breve M$, all $v$ in $B_R(z)$. It follows from the right half of inequality (35) that $\ell_0$ is a bounded linear functional.

The points $v$ of $B_R(z)$ are of the form $v = z + Rx$, $|x| < 1$. By the right half of inequality (35),

$$ c \le \ell_0(z) + R\,\ell_0(x). $$

It follows from the definition of the norm of a linear functional that

$$ \inf_{|x| < 1} \ell_0(x) = -|\ell_0|; $$

it follows from the inequality above that

$$ c \le \ell_0(z) - R\,|\ell_0|. \tag{36} $$

From the left half of inequality (35) and the definition (32) of $S_M$, we conclude that

$$ S_M(\ell_0) \le c. \tag{36'} $$

Combining (36) and (36') gives

$$ S_M(\ell_0) + R\,|\ell_0| \le \ell_0(z). \tag{37} $$

Since $\ell_0 \ne 0$, $|\ell_0| > 0$; thus (37) shows that if $z$ does not belong to $\breve M$, (34) fails for some $\ell_0$, as asserted in theorem 17. □


Theorem 18. $K$ denotes a closed, convex subset of a normed linear space $X$ over the reals, $z$ a point of $X$ not in $K$. Then

$$ \inf_{u \text{ in } K} |z - u| = \sup_{|\ell| = 1} [\ell(z) - S_K(\ell)]. \tag{38} $$


Proof. By definition (32) of support function,

$$ S_K(\ell) \ge \ell(u) \quad \text{for all } \ell, \text{ all } u \in K. $$

So for $|\ell| = 1$,

$$ S_K(\ell) \ge \ell(u) = \ell(z) + \ell(u - z) \ge \ell(z) - |u - z|, $$

which is the same as $|u - z| \ge \ell(z) - S_K(\ell)$. It follows from this that

$$ \inf_{u \text{ in } K} |u - z| \ge \sup_{|\ell| = 1} [\ell(z) - S_K(\ell)]. \tag{39} $$

To show the opposite inequality, let $R$ be any positive number less than the inf on the left in (38). Denote by $B_R$ the ball of radius $R$ around the origin; then the set $K + B_R$ has a positive distance from $z$. So it follows from theorem 17, with $K + B_R$ in place of $M$, that for some $\ell_0$ in $X'$,

$$ S_{K + B_R}(\ell_0) < \ell_0(z). \tag{40} $$

We use now additivity and example (b):

$$ S_{K + B_R}(\ell_0) = S_K(\ell_0) + R\,|\ell_0|. \tag{40'} $$

Since we may choose $\ell_0$ to have norm $= 1$, it follows from (40) and (40') that

$$ R < \ell_0(z) - S_K(\ell_0) $$

for some $\ell_0$ with $|\ell_0| = 1$; it follows from this that the sup on the right of (38) is $> R$. Since $R$ can be any number less than the inf on the left of (38), it follows that

$$ \inf_{u \text{ in } K} |u - z| \le \sup_{|\ell| = 1} [\ell(z) - S_K(\ell)]. \tag{39'} $$

Combined with (39) this proves (38). □
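
The duality (38) can be checked on a small example. The sketch below is an illustration under assumptions not made in the text: $X$ is the Euclidean plane, $K$ is the unit square $[0, 1]^2$, and unit functionals are unit vectors sampled on the circle. All function names are hypothetical.

```python
import math

def dist_to_square(z):
    """Distance from z to K = [0,1]^2: clamp each coordinate into [0, 1]."""
    nearest = (min(max(z[0], 0.0), 1.0), min(max(z[1], 0.0), 1.0))
    return math.hypot(z[0] - nearest[0], z[1] - nearest[1])

def support_of_square(ell):
    """S_K(ell) for K = [0,1]^2: the sup is attained at a corner."""
    return max(ell[0], 0.0) + max(ell[1], 0.0)

def dual_value(z, samples=200000):
    """Estimate sup of ell(z) - S_K(ell) over Euclidean unit vectors ell."""
    best = -math.inf
    for i in range(samples):
        theta = 2.0 * math.pi * i / samples
        ell = (math.cos(theta), math.sin(theta))
        best = max(best, ell[0] * z[0] + ell[1] * z[1] - support_of_square(ell))
    return best

z = (2.0, 3.0)
primal = dist_to_square(z)   # left side of (38)
dual = dual_value(z)         # right side of (38), sampled
print(primal, dual)
```

For $z = (2, 3)$ both sides equal $\sqrt{5}$; the maximizing functional points from the nearest corner $(1, 1)$ toward $z$.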


Theorem 18 presents another example of dual variational problems. Theorem 7 is a special case of theorem 18, if we extend the definition of support function to a linear space $Y$:

$$ S_Y(\ell) = \begin{cases} 0 & \text{if } \ell \text{ in } Y^\perp, \\ \infty & \text{if } \ell \text{ not in } Y^\perp. \end{cases} $$


BIBLIOGRAPHY 


Day, M. M. Normed linear spaces. Ergebnisse der Math. und ihrer Grenzgebiete, 21, 1962. 


Kakutani, S. Concrete representation of abstract (M)-spaces. (A characterization of the space of continuous functions). An. Math., 42 (1941): 994-1024.


Milman, D. P. On some criteria for the regularity of spaces of type (B). Dokl. Akad. Nauk SSSR (N.S.), 20 (1938): 234.


Riesz, F. Sur les opérations fonctionnelles linéaires. C.R. Acad. Sci. Paris, 149 (1909): 615-619.


9 


APPLICATIONS OF DUALITY 


9.1 COMPLETENESS OF WEIGHTED POWERS 


Let $w(t)$ be a given positive function defined on $\mathbb{R}$ that decays exponentially as $|t| \to \infty$:

$$ 0 < w(t) < a\,e^{-c|t|}, \quad c > 0. \tag{1} $$

Denote by $C$ the set of continuous functions on $\mathbb{R}$ that vanish at $\infty$:

$$ \lim_{|t| \to \infty} x(t) = 0. \tag{2} $$

$C$ is a Banach space under the maximum norm.


Theorem 1. The functions $t^n w(t)$ belong to $C$; their closed linear span is all of $C$. That is, every function in $C$ can be approximated uniformly on $\mathbb{R}$ by weighted polynomials.


Proof. We will use theorem 8 of chapter 8. Let $\ell$ be any bounded linear functional over $C$ that vanishes on the functions $t^n w$:

$$ \ell(t^n w) = 0, \quad n = 0, 1, \ldots. \tag{3} $$

Let $\zeta$ be a complex variable, $|\mathrm{Im}\, \zeta| < c$. Then $w(t)\,e^{i\zeta t}$ belongs to $C$, and so

$$ f(\zeta) = \ell(w\,e^{i\zeta t}) \tag{4} $$

is defined in the strip $|\mathrm{Im}\, \zeta| < c$. We claim that $f(\zeta)$ is analytic there. For the complex difference quotients of $w\,e^{i\zeta t}$ tend to $iwt\,e^{i\zeta t}$ in the norm of $C$; and so

$$ f'(\zeta) = \lim_{\delta \to 0} \frac{f(\zeta + \delta) - f(\zeta)}{\delta} = \lim_{\delta \to 0} \ell\!\left( w\,\frac{e^{i(\zeta + \delta)t} - e^{i\zeta t}}{\delta} \right) = \ell(iwt\,e^{i\zeta t}). $$

Similarly for the higher derivatives; in particular, using (3),

$$ \left. \frac{d^n f}{d\zeta^n} \right|_{\zeta = 0} = i^n\,\ell(w t^n) = 0, \quad n = 0, 1, \ldots. $$

Since $f$ is analytic, the vanishing of all its derivatives at $\zeta = 0$ means that $f(\zeta) \equiv 0$ in the strip; in particular

$$ f(\zeta) = \ell(w\,e^{i\zeta t}) = 0 \quad \text{for all real } \zeta. $$

By theorem 8, chapter 8, it follows that all functions $w\,e^{i\zeta t}$ belong to the closed linear span of $t^n w$.

According to the Weierstrass approximation theorem, every continuous periodic function $h(t)$ is the uniform limit of trigonometric polynomials. It follows that $wh$ belongs to the closed linear span of the functions $w\,e^{i\zeta t}$, $\zeta$ real, hence of the functions $t^n w$. Let $y$ be any continuous function of compact support; define $x$ by

$$ x = \frac{y}{w}. \tag{5} $$

Denote by $h$ a $2p$-periodic function such that

$$ x(t) = h(t) \quad \text{for } |t| < p, \tag{5'} $$

$p$ chosen so large that the support of $x$ is contained in the interval $|t| < p$. Then

$$ |x - h|_{\max} \le |x|_{\max}, $$

and so, by (5), (5'), and (1),

$$ |y - wh| \le a\,e^{-cp}\,|x|_{\max}. $$

This shows that as $p \to \infty$, $wh \to y$. Since $wh$ belongs to the closed linear span of the functions $t^n w$, so does $y$. The functions $y$ of compact support are dense in $C$, and the proof is complete. □


9.2 THE MUNTZ APPROXIMATION THEOREM 


According to the Weierstrass approximation theorem, any continuous function $x(t)$ on the interval $[0, 1]$ can be approximated uniformly by polynomials in $t$. Let $n$ be any positive integer. Clearly, if $x(t)$ is continuous on $[0, 1]$, so is

$$ y(s) = x(s^{1/n}). $$

Now $y(s)$ can be approximated arbitrarily closely in the maximum norm by polynomials $p(s)$. Setting $s = t^n$, we conclude that $x(t)$ can be approximated arbitrarily closely by linear combination of $t^{jn}$, $j = 0, 1, \ldots$. Thus in the Weierstrass approximation theorem not all powers of $t$ are needed.

Serge Bernstein posed the following question: What sequences of positive numbers $\{\lambda_j\}$ tending to $\infty$ have the property that the closed linear span of the functions

$$ t^{\lambda_j}, \quad j = 1, 2, \ldots \tag{6} $$

is the space $C$ of all continuous functions on $[0, 1]$? After some preliminary results were obtained by Bernstein, Müntz proved the following:


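Theorem 2. Let $\lambda_j$ be a sequence of positive numbers tending to $\infty$. The functions (6) span the space of all continuous functions $C$ on $[0, 1]$ that vanish at $t = 0$ iff

$$ \sum_j \frac{1}{\lambda_j} = \infty. \tag{7} $$

Proof. We will use the spanning criterion, theorem 8, chapter 8. Let $\ell$ be a bounded linear functional on $C$ that vanishes on all the functions (6):

$$ \ell(t^{\lambda_j}) = 0, \quad j = 1, 2, \ldots. \tag{8} $$

Let $\zeta$ be a complex variable, $\mathrm{Re}\, \zeta > 0$. For such $\zeta$, $t^\zeta$ belongs to $C$ and depends analytically on $\zeta$, in the sense that

$$ \lim_{\delta \to 0} \frac{t^{\zeta + \delta} - t^\zeta}{\delta} = (\log t)\, t^\zeta $$

exists in the sense of the norm in $C$, the maximum norm. Define

$$ f(\zeta) = \ell(t^\zeta). \tag{9} $$

It follows that $f$ is an analytic function of $\zeta$. Furthermore, since $|t^\zeta| \le 1$ when $0 \le t \le 1$ and $\mathrm{Re}\, \zeta > 0$, and since $\ell$ is bounded, say $|\ell| \le 1$, it follows from (9) that

$$ |f(\zeta)| \le 1 \quad \text{for } \mathrm{Re}\, \zeta > 0. \tag{10} $$

Relation (8) can be expressed as

$$ f(\lambda_j) = 0. \tag{11} $$

We define a Blaschke product $B_N(\zeta)$ as follows:

$$ B_N(\zeta) = \prod_{j=1}^{N} \frac{\zeta - \lambda_j}{\zeta + \lambda_j}. \tag{12} $$

It has the following properties:

$$ B_N(\lambda_j) = 0, \quad j = 1, \ldots, N. \tag{13a} $$
$$ B_N(\zeta) \ne 0 \quad \text{for } \zeta \ne \lambda_j. \tag{13b} $$
$$ |B_N(\zeta)| \to 1 \quad \text{as } \mathrm{Re}\, \zeta \to 0. \tag{13c} $$
$$ |B_N(\zeta)| \to 1 \quad \text{as } |\zeta| \to \infty. \tag{13d} $$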
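
Properties (13a) to (13d) are easy to probe numerically with complex arithmetic. The sketch below is only an illustration: the exponents in `lambdas` are hypothetical sample values, and `blaschke` is a hypothetical helper implementing the product (12).

```python
def blaschke(zeta, lambdas):
    """B_N(zeta) = product over j of (zeta - lambda_j) / (zeta + lambda_j)."""
    value = 1.0 + 0.0j
    for lam in lambdas:
        value *= (zeta - lam) / (zeta + lam)
    return value

lambdas = [1.0, 2.5, 4.0]            # hypothetical exponents lambda_j > 0

print(abs(blaschke(2.5, lambdas)))            # zero at zeta = lambda_2, cf. (13a)
print(abs(blaschke(3.0j, lambdas)))           # modulus 1 on the imaginary axis
print(abs(blaschke(1e-9 + 3.0j, lambdas)))    # close to 1 as Re zeta -> 0, cf. (13c)
print(abs(blaschke(1e9 + 1e9j, lambdas)))     # close to 1 as |zeta| -> infinity, cf. (13d)
print(abs(blaschke(0.7 + 0.2j, lambdas)))     # below 1 inside Re zeta > 0
```

Each factor has modulus below 1 in the right half plane because $\zeta$ is then closer to $\lambda_j$ than to $-\lambda_j$; this is what makes dividing $f$ by $B_N$ harmless for the bound (10).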


Since the zeros of $B_N(\zeta)$ are shared by $f(\zeta)$,

$$ g_N(\zeta) = \frac{f(\zeta)}{B_N(\zeta)} \tag{14} $$

is regular analytic in $\mathrm{Re}\, \zeta > 0$. We claim that

$$ |g_N(\zeta)| \le 1 \quad \text{for } \mathrm{Re}\, \zeta > 0. \tag{15} $$

For combining (10) and (13c), (13d) we conclude that for any $\epsilon > 0$, $|g_N(\zeta)| \le 1 + \epsilon$ for $\mathrm{Re}\, \zeta = \delta$, and for $|\zeta| = \delta^{-1}$, $\delta$ small enough. By the maximum principle for the analytic function $g_N$ on the domain $\mathrm{Re}\, \zeta \ge \delta$, $|\zeta| \le \delta^{-1}$, $|g_N(\zeta)| \le 1 + \epsilon$ there. Letting $\delta, \epsilon \to 0$, we obtain (15). Let $k$ be a positive number such that $f(k) \ne 0$; from (14) and (15) we conclude that

$$ \prod_{j=1}^{N} \left| \frac{\lambda_j + k}{\lambda_j - k} \right| \le \frac{1}{|f(k)|}. \tag{16} $$

We can write the factors on the left in (16) as

$$ \frac{\lambda_j + k}{\lambda_j - k} = 1 + \frac{2k}{\lambda_j - k}. $$

Since $\lambda_j \to \infty$, all but a finite number of the factors above are $> 1$. So from the uniform boundedness of the product (16) for all $N$, we conclude the uniform boundedness, for all $N$, of the sum

$$ \sum_{j=1}^{N} \frac{1}{\lambda_j}. $$

This contradicts (7); so we conclude that $f(k) = 0$ for all $k$. In view of the definition (9) of $f$, and property (8) of $\ell$, this says that any linear functional that vanishes on the functions $t^{\lambda_j}$ vanishes on $t^k$, $k$ positive. So by theorem 8, chapter 8, the spanning criterion, we conclude that all functions $t^k$ can be approximated uniformly by linear combination of the functions $\{t^{\lambda_j}\}$. Taking in particular $k = 1, 2, 3, \ldots$ and appealing to the Weierstrass approximation theorem, we conclude that the functions (6) span $C$.
We omit the proof of the necessity of condition (7).


NOTE. Szász has extended Müntz's theorem to complex $\lambda_j$.


Exercise 1. Formulate and prove the extension of theorem 2 to complex exponents.


DUAL VARIATIONAL PROBLEMS IN FUNCTION THEORY 91 
9.3 RUNGE’S THEOREM 


Theorem 3. Let D be a bounded simply connected domain in C. Every analytic 
function f (¢) in D can be approximated, uniformly on compact subsets K, by poly- 
nomials in €. 


Proof. Since D is simply connected, every compact subset of D is contained in a 
simply connected compact subset K of D. Choose a closed smooth curve in D — K 
that winds once around every point of K, and express f(¢), ¢ in K, by Cauchy’s 
integral formula. This integral can be approximated, uniformly for all points ¢ of K, 
by a sum. This sum is a linear combination of functions of the form (x — ¢)~ x 
on the curve. Therefore to prove the theorem, it suffices to show that all functions of 
¢ of form (x — ¢)—!, x not in K, can be approximated on K by polynomials in ¢. 
This is clear when |x| > R, R = max |¢|, ¢ in K. Then the geometric series 


0° g" 


GADA T 


converges uniformly on K. To show it for all ¢, we will use the spanning criterion. 
Let $\ell$ be any bounded linear functional on C(K) that vanishes on all polynomials:

$$\ell(p) = 0.$$

We claim that $\ell$ vanishes on all functions of the form $(x - \zeta)^{-1}$, x not in K. Define

$$g(x) = \ell\big((x - \zeta)^{-1}\big).$$

Since $(x - \zeta)^{-1}$, as element of C(K), depends analytically on x, it follows that
g(x) is an analytic function of x in the exterior of K. Since for |x| > R, $(x - \zeta)^{-1}$
belongs to the closure of polynomials p, and since $\ell(p) = 0$, it follows by continuity
that $\ell((x - \zeta)^{-1}) = 0$ for such x, and so

$$g(x) = 0 \quad \text{for } R < |x|.$$

Since g is analytic in the exterior, and since the exterior of a simply connected set
K is connected, it follows that g(x) = 0 for all x not in K. Then, by the spanning
criterion, theorem 8 of chapter 8, for all x outside K, $(x - \zeta)^{-1}$ is in the closure of
the space of polynomials. □


This beautiful proof is due to Lars Hörmander.
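
The easy case of the proof, |x| > R, rests on the uniform convergence of the geometric series for $(x - \zeta)^{-1}$. The following is a small numerical sketch of that step only, not part of the original text: it takes K to be the closed unit disc (so R = 1) and x = 2, both chosen here purely for illustration, and measures the worst error of the polynomial partial sums on the boundary circle. The function names are hypothetical.

```python
import cmath


def partial_sum(zeta: complex, x: complex, terms: int) -> complex:
    """Polynomial in zeta: sum of zeta**n / x**(n+1) for n = 0 .. terms-1."""
    total = 0j
    term = 1 / x              # n = 0 term
    for _ in range(terms):
        total += term
        term *= zeta / x      # passes from zeta**n/x**(n+1) to the next term
    return total


def max_error_on_circle(x: complex, radius: float, terms: int,
                        samples: int = 360) -> float:
    """Largest sampled error |1/(x - zeta) - partial sum| on |zeta| = radius."""
    worst = 0.0
    for j in range(samples):
        zeta = radius * cmath.exp(2j * cmath.pi * j / samples)
        worst = max(worst, abs(1 / (x - zeta) - partial_sum(zeta, x, terms)))
    return worst


# Illustrative choice: K = closed unit disc, x = 2 outside K.
errors = [max_error_on_circle(2.0, 1.0, terms) for terms in (10, 20, 30)]
print(errors)
```

The error is analytic in $\zeta$, so by the maximum principle sampling the boundary circle controls the error on the whole disc. The tail of the series is $(\zeta/x)^N/(x - \zeta)$, so the error decays geometrically at rate R/|x|.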


9.4 DUAL VARIATIONAL PROBLEMS IN FUNCTION THEORY 


Theorem 4. Let D be a bounded domain in $\mathbb{C}$, whose boundary consists of a finite
number of $C^1$ arcs. Denote by A the space of functions analytic in D and continuous
up to the boundary, and by $\zeta_0$ any point in D. Define

$$M = \sup |f'(\zeta_0)|, \qquad f \text{ in } A, \quad |f(\zeta)| \le 1 \text{ in } D. \tag{17}$$

Denote by $u_0$ the function

$$u_0(\zeta) = \frac{1}{2\pi i}\, \frac{1}{(\zeta - \zeta_0)^2} \tag{18}$$

on $\partial D$, and define

$$m = \inf_{g \in A} \int_{\partial D} |u_0(\zeta) - g(\zeta)|\, |d\zeta|. \tag{19}$$

We claim that m = M, and that the supremum M in (17) is attained.


Proof. We show first that $M \le m$. Using the Cauchy integral formula, and the
Cauchy integral theorem, we can represent $f'(\zeta_0)$ as follows:

$$f'(\zeta_0) = \int_{\partial D} f\, u_0 \, d\zeta = \int_{\partial D} f\,(u_0 - g)\, d\zeta, \tag{20}$$

where g is any function in A. Since f in (17) is $\le 1$ in absolute value, we deduce
from (20) that

$$|f'(\zeta_0)| \le \int_{\partial D} |u_0 - g|\, |d\zeta|.$$

Choose g so that it nearly minimizes (19); we get

$$|f'(\zeta_0)| \le m + \epsilon.$$

In view of (17) this implies $M \le m + \epsilon$, and since $\epsilon > 0$ is arbitrary, $M \le m$.

To prove the converse inequality, we look at the space C of continuous functions
on $\partial D$; $u_0$ belongs to C, and A is a linear subspace of C. We impose the $L^1$-norm
$|\cdot|_1$ on C, with respect to arclength $|d\zeta|$ along $\partial D$. The infimum (19) can be written
as

$$m = \inf_{g \in A} |u_0 - g|_1.$$

According to theorem 7 of chapter 8,

$$m = \max_{|\ell| \le 1} |\ell(u_0)|, \qquad \ell = 0 \text{ on } A. \tag{21}$$

We denote by $\ell_0$ an $\ell$ that maximizes (21). We define the function $f_0$ for x not on
$\partial D$ by

$$f_0(x) = \frac{1}{2\pi i}\, \ell_0\Big(\frac{1}{\zeta - x}\Big). \tag{22}$$


We claim that $f_0$ has the following properties:

(i) $f_0(x)$ is analytic for x not on $\partial D$.
(ii) $f_0(x) = 0$ for x not in D.
(iii) $|f_0(x)| \le 1$.
(iv) $f_0'(\zeta_0) = m$.


Property (i) follows from the fact that $1/(\zeta - x)$ as an element of C depends
analytically on x. Property (ii) follows since for x not in D, $1/(\zeta - x)$ lies in A, and
$\ell_0$ vanishes on A. To prove (iii), choose x near $\partial D$, and take x' to be the reflection of
x across $\partial D$; x' lies outside D. We will use (ii) to write

$$f_0(x) = f_0(x) - f_0(x') = \frac{1}{2\pi i}\, \ell_0\Big(\frac{1}{\zeta - x} - \frac{1}{\zeta - x'}\Big).$$

Since $|\ell_0| \le 1$,

$$|f_0(x)| \le \frac{1}{2\pi} \int_{\partial D} \Big|\frac{1}{\zeta - x} - \frac{1}{\zeta - x'}\Big|\, |d\zeta|.$$

A simple estimate shows that the right side is $\le 1 + \epsilon$ for x near the
boundary. So $|f_0(x)| \le 1 + \epsilon$ for x near the boundary; from this (iii) follows by the
maximum principle.

Differentiate (22) and set $x = \zeta_0$; using (18) and (21), we get

$$f_0'(\zeta_0) = \ell_0(u_0) = m. \tag{23}$$

Since we have already shown that $|f'(\zeta_0)| \le m$ for $|f| \le 1$ in D, it follows that $f_0$
solves the maximum problem (17), and that M = m. □


Suppose that D is simply connected. We claim that $f_0$, the function maximizing
(17), maps D conformally onto the unit disc. Here is a sketch of a proof:

$$m = \int_{\partial D} f_0\,(u_0 - g_0)\, \frac{d\zeta}{ds}\, ds,$$

where s is arclength. Using the fact that $|f_0(\zeta)| \le 1$, and that, by (19),

$$m = \int_{\partial D} |u_0 - g_0|\, \Big|\frac{d\zeta}{ds}\Big|\, ds,$$

we conclude that $|f_0(\zeta)| = 1$ on $\partial D$ and that

$$f_0\,(u_0 - g_0)\, \frac{d\zeta}{ds} > 0 \quad \text{on } \partial D. \tag{24}$$


Denote by $2\pi[h]$ the change of argument of a nonzero complex valued function h
around $\partial D$. From (24) we deduce that

$$[f_0] + [u_0 - g_0] + \Big[\frac{d\zeta}{ds}\Big] = 0. \tag{25}$$

According to the argument principle, for the boundary values h of functions
meromorphic in D,

$$[h] = \#\,\text{zeros} - \#\,\text{poles in } D.$$

But $f_0$ and $g_0$ have no poles, and $u_0$ has a single pole of order 2. For D simply
connected, $[d\zeta/ds] = 1$; so we deduce from (25) that

$$\#\,\text{zeros of } f_0 + \#\,\text{zeros of } (u_0 - g_0) - 2 + 1 = 0. \tag{26}$$


It follows from (26) that $f_0$ has at most one zero in D. We claim that it has at
least one zero; otherwise, $1/f_0$ would be analytic in D. Since $|f_0| = 1$ on $\partial D$, it
would follow from the maximum principle, applied to $f_0$ and to $1/f_0$, that $|f_0(\zeta)| = 1$
in D, which implies that $f_0$ = const., contrary to $f_0'(\zeta_0) = m$. Combining the two
statements above, we conclude that $f_0$ has exactly one zero in D.

According to the argument principle, $[f_0]$ equals 1. Since $|f_0(\zeta)| = 1$ for $\zeta$ on
$\partial D$, $[f_0 - w] = 1$ for all w inside the unit disc; it follows that $f_0(\zeta)$ takes on every
value w exactly once in D. This shows that $f_0$ maps D one-to-one onto the open unit
disc.

The arguments used in this section combine methods introduced by Rogosinski 
and Shapiro with results of Garabedian and Schiffer. 


9.5 EXISTENCE OF GREEN’S FUNCTION 


Definition. Let D be a plane domain whose boundary B is once continuously
differentiable. Green's function G(p, q) of the domain D is defined for p, q in D by
the requirements that

(i) $\Delta_p G = \delta(p - q)$, where $\Delta_p$ is the Laplace operator with respect to p, and $\delta$
is the Dirac distribution, see Appendix B.

(ii) G(p, q) = 0 for p on B.


In this definition, the variables p and q play unsymmetric roles, q appearing merely
as a parameter in a boundary value problem.

The significance of Green's function is that it can be used to represent every
harmonic function h in D in terms of the boundary values of h by using Green's formula:

$$h(q) = \int_{\partial D} h(p)\, G_n(p, q)\, dp,$$

where $G_n(p, q)$ is the derivative of G with respect to p in the direction normal to
the boundary B, and dp is arclength. Furthermore, for D simply connected, G is the
logarithm of the absolute value of the analytic function f mapping D onto the unit
disk, carrying q into the origin.

Green's function can be split into its singular and regular part:

$$G(p, q) = -\frac{1}{2\pi} \log|p - q| + g_0(p, q). \tag{27}$$


The function $g_0$ is called the regular part of Green's function. Referring to (i) and
(ii) above, we can characterize $g_0$ as the solution of the following boundary value
problem:

$$\Delta_p g_0 = 0 \quad \text{in } D, \tag{28}$$
$$g_0(p, q) = \log|p - q| \quad \text{for } p \text{ on } B, \tag{29}$$

where log r is an abbreviation for $(1/2\pi) \log r$.


Clearly, the definition of Green's function rests on the fact that the boundary value
problem (28), (29) can be solved. Classically this is deduced from the solvability
of the Dirichlet boundary value problem for $\Delta$ with arbitrarily prescribed boundary
values. Here we show how to solve (28), (29) without appealing to the general theory.
We denote by C the space of continuous functions on the boundary B normed by
the max norm. We denote by H the subspace consisting of the boundary values
of functions h that are harmonic in D and continuous up to the boundary. In what
follows the point q is fixed once and for all in D. We define the functional $\ell_q$ for
functions h in H by setting

$$\ell_q(h) = h(q), \tag{30}$$

where h(q) denotes the value at q of the harmonic function whose value on B is
denoted by h. It is well known in the theory of harmonic functions that h at q is
uniquely determined by h on the boundary B, and that the maximum principle holds:

$$|h(q)| \le \max_{p \in B} |h(p)| = |h|.$$

This inequality can be expressed so: as defined on H, the norm of $\ell_q$ is $\le 1$. It
follows then from the Hahn-Banach theorem that $\ell_q$ can be extended from H to all
of C so that

$$|\ell_q| \le 1. \tag{31}$$


We denote by w any point of the plane not on the boundary B of D, and define the
element k(w) of C by

$$k(p, w) = \log|p - w|, \qquad p \in B. \tag{32}$$

Two observations on the manner of dependence of k on the parameter w:


(i) k(w) is a differentiable function of w, and satisfies

$$\Delta_w k = 0. \tag{33}$$

(ii) For w in the exterior of D, k(w) belongs to H.

We define now the function g(w, q) by

$$g(w, q) = \ell_q(k(w)), \tag{34}$$

where k is defined by (32).


Lemma 5.

(i) g(w, q) is a harmonic function of w in the complement of B.
(ii) For w' in the exterior of D,

$$g(w', q) = \log|q - w'|. \tag{35}$$


Proof. Since $\ell_q$ is linear, by (34),

$$\ell_q\Big(\frac{k(w + du) - k(w)}{d}\Big) = \frac{g(w + du, q) - g(w, q)}{d}.$$

We let d tend to 0. Since $\ell_q$ is bounded, we deduce that

$$\ell_q(\partial_w k) = \partial_w g(w, q).$$

Applying this to second derivatives and using (33), we get

$$\Delta_w g = \ell_q(\Delta_w k) = 0,$$

as asserted in part (i).

For w' in the exterior of D, k(w') belongs to H. Applying the original definition
(30) of $\ell_q$ in (34) yields (35). □


Lemma 6. g(w, q) depends continuously on w as w crosses the boundary.

To see this, let w be a point in D close to the boundary, and w' the reflection of w
across the boundary. The reflection w' is obtained by drawing a straight line from w to
the nearest boundary point $p_0$, and choosing w' so that

$$\frac{w + w'}{2} = p_0.$$

By definition (34) and since $\ell_q$ is linear,

$$g(w, q) - g(w', q) = \ell_q(k(w) - k(w')) = \ell_q\Big(\log\frac{|p - w|}{|p - w'|}\Big). \tag{36}$$

It is easy to verify that since the boundary B has a continuously turning tangent,

$$\frac{|p - w|}{|p - w'|} \to 1 \tag{37}$$

as w tends to the boundary B, uniformly for all points p in B. It follows that

$$\log\frac{|p - w|}{|p - w'|} \to 0 \tag{38}$$

uniformly for all p on B, and thus in the maximum norm. Since $\ell_q$ is a bounded
linear functional, it follows that the right side of (36) tends to 0 as w approaches the
boundary.

For w in D and near the boundary, w' lies outside D. According to (35), g(w', q)
tends to log|q − p| as w' approaches p on the boundary. This proves lemma 6, and
shows that as w in D tends to p in B,

$$\lim g(w, q) = \log|q - p|. \tag{39}$$

According to part (i) of lemma 5 we see that g(w, q) is a harmonic function in D.
According to (39) its boundary values are log|q − p|. These two facts characterize
the regular part $g_0$ of Green's function and therefore

$$g(w, q) = g_0(w, q).$$

This concludes the proof of the existence of Green's function. □


REMARK 1. The argument above shows that k(w) defined by (32) is the boundary
value of the harmonic function g(p, w), and therefore belongs to H even when w
lies in D. Therefore the original definition (30) of $\ell_q$ applies:

$$\ell_q(k(w)) = g(q, w).$$

By (34), it follows that

$$g(w, q) = g(q, w). \tag{40}$$

This shows that the regular part of Green's function depends symmetrically on its
two arguments. But then, by (27), so does Green's function itself.


REMARK 2. In case of a boundary curve B that is twice differentiable, relation (37)
can be sharpened to

$$\frac{|p - w|}{|p - w'|} = 1 + O(d),$$

where d = |w − p| is the distance of w to the boundary. Using this, we can sharpen
(38):

$$\log\frac{|p - w|}{|p - w'|} = O(d).$$

We set this into (36). Since by (35), g(w', q) = log|q − w'|, it differs from g(p, q) =
log|q − p| by O(d). So we conclude that as w in D tends to the nearest boundary
point p,

$$|g(w, q) - g(p, q)| \le O(d). \tag{41}$$


We claim that the first derivatives of g are uniformly bounded in D up to the
boundary. This is because we can express the first derivatives of the harmonic function g at
w as integrals over the circle of radius d centered at w:

$$2\pi\, \mathrm{grad}\, g(w) = \frac{2}{d} \int_0^{2\pi} [\,g(w + d\,e(\theta)) - g(p)\,]\, e(\theta)\, d\theta, \tag{42}$$

where $e(\theta) = (\cos\theta, \sin\theta)$. It follows from (41) that the integrand in (42) is O(d);
from this the uniform boundedness of grad g follows.

We know from the Cauchy-Riemann equations that the conjugate harmonic function
to g also has uniformly bounded first derivatives in D. This shows that for D
simply connected, the analytic function mapping D onto a disk is uniformly Lipschitz
continuous.


BIBLIOGRAPHY 


Garabedian, P. R. and Schiffer, M. On existence theorems of potential theory and conformal mapping. Ann.
Math., 52 (1950): 164-187.


Garabedian, P. R. and Shiffman, M. On solutions of partial differential equations by the Hahn-Banach 
theorem. Trans. AMS, 76 (1954): 288-299. 


Lax, P. D. On the existence of Green’s function. Proc. AMS, 3 (1952): 526-531. 
Lax, P. D. Reciprocal extremal problems in function theory. CPAM, 8 (1955): 437-454. 


Müntz, Ch. H. Über den Approximationssatz von Weierstrass. Math. Abhandlungen H. A. Schwarz gewidmet,
Berlin (1914): 303-312.

Rogosinski, W. W. and Shapiro, H. S. On certain extremum problems for analytic functions. Acta Math., 
84 (1953): 287-318. 


Szász, O. Über die Approximation stetiger Funktionen durch lineare Aggregate von Potenzen. Math. Ann.,
77 (1915-1916): 482-496.


10 


WEAK CONVERGENCE 


Definition. A sequence $\{x_n\}$ in a normed linear space X is said to converge weakly
to x if

$$\lim_{n\to\infty} \ell(x_n) = \ell(x) \tag{1}$$

for every $\ell$ in X'. This relation is indicated by a half arrow:

$$x_n \rightharpoonup x.$$

Another notation is

$$\text{w-}\lim_{n\to\infty} x_n = x.$$

The notion is to be contrasted with convergence in the sense of the norm:

$$\lim_{n\to\infty} |y_n - y| = 0. \tag{2}$$

In this case we say that $\{y_n\}$ tends strongly to y, and denote it as

$$y_n \to y.$$

Another suggestive notation for strong convergence is

$$\text{s-}\lim_{n\to\infty} y_n = y.$$

Clearly, a sequence that converges to x strongly also converges weakly to x, but
in general not vice versa. Here are some examples:


Example 1. $X = \ell^2$; its points are vectors

$$x = (a_1, a_2, \ldots)$$

with denumerably many components, such that

$$|x|^2 = \sum_j |a_j|^2 < \infty. \tag{3}$$

Since $\ell^2$ is a Hilbert space, according to theorem 4 of chapter 6 all bounded linear
functionals on $\ell^2$ are of the form

$$\ell(x) = (x, y) = \sum_j a_j b_j, \qquad \sum_j |b_j|^2 < \infty. \tag{4}$$

Define $x_n$ as the nth unit vector, that is, the vector whose nth component is 1, all
others zero: $x_n = (0, \ldots, 0, 1, 0, \ldots)$. It is easy to show, and left as an exercise, that
the sequence $\{x_n\}$ tends weakly to zero, but not strongly.


Example 2. H any Hilbert space, $\{x_n\}$ an orthonormal sequence; such a sequence
tends weakly, but not strongly, to zero.

Proof. It follows from Bessel's inequality (31) of chapter 6, that for any y in H

$$\sum_n |(x_n, y)|^2 \le \|y\|^2, \tag{5}$$

from which it follows that

$$\ell(x_n) = (x_n, y) \to 0. \tag{5'}$$

Since according to the Riesz representation theorem, theorem 4 in chapter 6, all linear
functionals are of the form (5'), weak convergence to zero follows. Since $\|x_n\| = 1$
for all n, $\{x_n\}$ does not tend to zero strongly.
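
Examples 1 and 2 can be illustrated numerically. The sketch below is an illustration only: it truncates $\ell^2$ to a finite dimension `DIM` (an assumption made so the vectors fit in memory) and pairs the unit vectors with the fixed element y = (1, 1/2, 1/3, ...), a choice made here for the example. The pairings $(x_n, y) = 1/n$ tend to zero while every $x_n$ keeps norm 1.

```python
import math

DIM = 2000  # truncation dimension; an assumption of this sketch


def unit_vector(n: int, dim: int = DIM) -> list:
    """The nth unit vector (1-based index), truncated to `dim` components."""
    v = [0.0] * dim
    v[n - 1] = 1.0
    return v


def inner(x: list, y: list) -> float:
    """Real scalar product of two truncated sequences."""
    return sum(a * b for a, b in zip(x, y))


def norm(x: list) -> float:
    return math.sqrt(inner(x, x))


# A fixed element of l^2 (square-summable): b_j = 1/j.
y = [1.0 / j for j in range(1, DIM + 1)]

indices = (1, 10, 100, 1000)
pairings = [inner(unit_vector(n), y) for n in indices]  # equals 1/n
norms = [norm(unit_vector(n)) for n in indices]         # always 1
print(pairings, norms)
```

The same computation shows why the inequality $|x| \le \liminf |x_n|$ for weak limits can be strict: here the weak limit is 0 while all norms equal 1.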


Example 3. X = C[0, 1].

$$x_n(t) = \begin{cases} nt & \text{for } 0 \le t \le 1/n, \\ 2 - nt & \text{for } 1/n \le t \le 2/n, \\ 0 & \text{for } 2/n \le t \le 1. \end{cases}$$

CLAIM. $x_n$ tends to zero weakly but not strongly.

Proof. Let $\ell$ be a bounded linear functional; we claim that $\lim \ell(x_n) = 0$.
Suppose not; then there would be infinitely many n such that $|\ell(x_n)| > \delta > 0$,
say

$$\ell(x_n) > \delta. \tag{6}$$

Choose a subsequence $\{n_k\}$, $n_{k+1} > 2n_k$, for which (6) holds. It is not hard to show
that for all t in [0, 1]

$$y_K(t) = \sum_{k=1}^{K} x_{n_k}(t) < 4, \tag{7}$$

which implies that $|y_K| < 4$ for all K. From (6) it follows that

$$\ell(y_K) = \sum_{k=1}^{K} \ell(x_{n_k}) > K\delta.$$

Since this holds for all K, and since $|y_K| < 4$ for all K, the boundedness of $\ell$ is
contradicted. Since $|x_n| = \max x_n(t) = 1$, $\{x_n\}$ does not tend to zero strongly. □


Exercise 1. Prove inequality (7). (Draw a picture of the graph of $x_n(t)$.)
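
Before attempting the exercise, inequality (7) can be checked numerically. The following sketch is a sanity check, not a proof: it uses the tent functions of Example 3 and the illustrative subsequence $n_k = 3^k$, which satisfies $n_{k+1} > 2n_k$. The grid size and the choice of subsequence are assumptions made here.

```python
def tent(n: int, t: float) -> float:
    """The function x_n of Example 3: rises to 1 at t = 1/n, back to 0 at 2/n."""
    if t <= 1.0 / n:
        return n * t
    if t <= 2.0 / n:
        return 2.0 - n * t
    return 0.0


def max_partial_sum(indices, grid: int = 20001) -> float:
    """Largest value of the sum of tent(n, t), n in indices, on a uniform grid."""
    worst = 0.0
    for i in range(grid):
        t = i / (grid - 1)
        worst = max(worst, sum(tent(n, t) for n in indices))
    return worst


# Illustrative subsequence n_k = 3**k, so that n_{k+1} = 3 n_k > 2 n_k.
indices = [3 ** k for k in range(1, 8)]
bound = max_partial_sum(indices)
print(bound)
```

The reason the sum stays bounded is visible in the code: at a fixed t, the indices with $n_k t \le 1$ contribute a geometric sum, at most one index falls in the range $1 < n_k t \le 2$, and all larger indices contribute zero.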
Example 4. $X = \ell^1$, consisting of vectors $x = (a_1, a_2, \ldots)$, $|x| = \sum |a_k| < \infty$. As
we saw in chapter 8 (exercise after theorem 11), the dual of $\ell^1$ is $\ell^\infty$. The following
observation is due to Schur:

If a sequence $\{x_n\}$ in $\ell^1$ converges weakly, it converges strongly.

Exercise 2. Prove the preceding statement.


10.1 UNIFORM BOUNDEDNESS OF WEAKLY
CONVERGENT SEQUENCES


The following result is useful in proving weak convergence: 


Theorem 1. Suppose that a sequence $\{x_n\}$ of points in a normed linear space satisfies

(i) $\{|x_n|\}$ are uniformly bounded:

$$|x_n| \le c.$$

(ii) $\lim \ell(x_n) = \ell(x)$ for a set of $\ell$ dense in X'. Then

$$\text{w-}\lim x_n = x.$$


Exercise 3. Prove theorem 1. 


Surprisingly, condition (i) is also necessary for weak convergence. To prove this we
turn to the principle of uniform boundedness for a complete metric space S: if a
collection $\{f_\nu\}$ of continuous real-valued functions $f_\nu$ on S is bounded at each point
x of S,

$$|f_\nu(x)| \le M(x) \quad \text{for all } \nu, \tag{8}$$

then the functions $f_\nu$ are uniformly bounded,

$$|f_\nu(u)| \le M, \tag{8'}$$

for all u in some nonempty open set O. We specialize this to the case where S is a
Banach space X and each $f_\nu$ is subadditive and absolutely homogeneous:

$$f(x + y) \le f(x) + f(y), \qquad f(ax) = |a|\, f(x). \tag{9}$$


Theorem 2. Let X be a Banach space, $\{f_\nu\}$ a collection of real-valued continuous
subadditive and absolutely homogeneous functions on X, bounded at each point x of
X as in (8). Then the $\{f_\nu\}$ are uniformly bounded, that is, there is a number c such
that

$$|f_\nu(x)| \le c\,|x| \tag{10}$$

for all $f_\nu$ and all x in X.


Proof. By the principle of uniform boundedness for metric spaces, $|f_\nu(u)| \le M$
for all $f_\nu$ and all u in some open ball u = z + y, |y| < r. Using subadditivity, we
have that

$$|f_\nu(y)| = |f_\nu(u - z)| \le |f_\nu(u)| + |f_\nu(z)| \le 2M \tag{11}$$

holds for all y with |y| = r/2. For any x in X define y by y = rx/2|x|. By
construction, |y| = r/2, so (11) holds. Using absolute homogeneity we get from (11)
that

$$|f_\nu(x)| = \Big|f_\nu\Big(\frac{2|x|}{r}\, y\Big)\Big| = \frac{2|x|}{r}\, |f_\nu(y)| \le \frac{4M}{r}\, |x|;$$

this proves (10), with c = 4M/r. □
An immediate consequence of theorem 2 is 


Theorem 3. X is a Banach space; $\{\ell_\nu\}$ a collection of bounded linear functionals
such that at every point x of X

$$|\ell_\nu(x)| \le M(x) \quad \text{for all } \ell_\nu. \tag{12}$$

Then there is a constant c such that

$$|\ell_\nu| \le c \quad \text{for all } \ell_\nu. \tag{13}$$

Proof. $|\ell(x)|$ is a continuous subadditive and absolutely homogeneous function of
x. Therefore theorem 2 is applicable; its conclusion (10) yields (13). □


Another immediate consequence is 


Theorem 4. X is a normed linear space, $\{x_\nu\}$ a collection of points in X such that
for every bounded linear functional $\ell$

$$|\ell(x_\nu)| \le M(\ell) \quad \text{for all } x_\nu. \tag{12'}$$

Then there is a constant c such that

$$|x_\nu| \le c \quad \text{for all } x_\nu. \tag{13'}$$

Proof. Theorem 4 follows from theorem 3 applied to the Banach space X', on
which the elements $x_\nu$ of X act as bounded linear functionals. □


An immediate consequence of theorem 4 is 


Theorem 4’. A weakly convergent sequence {xn} in a normed linear space X is 
uniformly bounded in norm. 


Proof. Weak convergence means that $\ell(x_n)$ is convergent for every $\ell$ in X'. Since
a convergent sequence of numbers is bounded, hypothesis (12') of theorem 4 is
satisfied; therefore (13') holds. □


Theorems 2, 3, and 4 are called the principle of uniform boundedness. 


Theorem 5. Let $\{x_n\}$ be a sequence in a normed linear space converging weakly
to x. Then

$$|x| \le \liminf |x_n|. \tag{14}$$

Proof. According to theorem 6 of chapter 8, there is an $\ell$ in X', such that

$$|x| = \ell(x), \qquad |\ell| = 1.$$

Since weak convergence means that

$$\ell(x) = \lim \ell(x_n),$$

and since

$$|\ell(x_n)| \le |\ell|\,|x_n| = |x_n|,$$

(14) follows. □


The following far reaching generalization of theorem 5 is due to Mazur:

Theorem 6. Let K be a closed, convex subset of a normed linear space X, $\{x_n\}$ a
sequence of points in K, converging weakly to a point x. Then x belongs to K.

Proof. Let $S_K$ be the support function of K, defined by equation (32) of chapter 8
as $S_K(\ell) = \sup_{y \in K} \ell(y)$. It follows from that definition that for any $\ell$ in X'

$$\ell(x_n) \le S_K(\ell). \tag{15}$$

Since $\ell(x_n)$ tends to $\ell(x)$, it follows that also

$$\ell(x) \le S_K(\ell).$$

But according to theorem 17 of chapter 8, this guarantees that x belongs to K. □


Exercise 4. Deduce theorem 5 from theorem 6 applied to balls centered at the origin,

$$K = B_R = \{x : |x| \le R\}.$$


10.2 WEAK SEQUENTIAL COMPACTNESS


Definition. A subset C of a Banach space X is called weakly sequentially compact
if any sequence of points in C has a subsequence weakly convergent to a point of C.

Exercise 5. Show that a weakly sequentially compact set is bounded.


The importance of weak sequential compactness is the same as that of compact- 
ness in the sense of strong convergence. Weak compactness is a valuable tool in 
constructing, as weak limits, mathematical objects of interest. To wield this tool, we 
need simple, easily verifiable criteria for weak compactness; the following is such a 
criterion: 


Theorem 7. In a reflexive Banach space X the closed unit ball is weakly sequentially 
compact. 


Proof. Let $\{y_n\}$ be a sequence of points in the unit ball, that is, $|y_n| \le 1$. Denote
by Y the closed linear subspace spanned by the set $\{y_n\}$; Y is separable. Since X is
assumed reflexive, it follows from theorem 15 of chapter 8 that Y is reflexive. Since
Y = Y'' is separable as well, it follows from theorem 13 of chapter 8 that Y' also
is separable, meaning that it contains a dense, denumerable subset $\{m_j\}$. Using the
classical diagonal process, we can select a subsequence $\{z_n\}$ of $\{y_n\}$ such that

$$\lim_{n\to\infty} m_j(z_n) \tag{16}$$

exists for every $m_j$. Since all $z_n$ satisfy $|z_n| \le 1$, and since the $\{m_j\}$ are dense,
it follows from (16) and theorem 1 that for all m in Y', $m(z_n)$ tends to a limit as
$n \to \infty$. This limit is a linear functional of m:

$$\lim_{n\to\infty} m(z_n) = y(m). \tag{16'}$$

Since $|m(z_n)| \le |m|\,|z_n| \le |m|$, it follows from (16') that the linear functional y(m)
has norm $\le 1$. Since Y is reflexive, there is a y in Y such that y(m) = m(y), $|y| \le 1$,
and so (16') says that for all m in Y', $m(z_n)$ tends to m(y) as $n \to \infty$. Since the
restriction of any $\ell$ in X' to Y is an m in Y', this proves that $z_n$ converges weakly to
a point y in the unit ball. □


Note the sharp contrast between theorem 7 and theorem 6 of chapter 5, according 
to which the unit ball is never compact in the norm topology. Compactness is gained 
by replacing strong with weak convergence. 

Eberlein has proved the converse of theorem 7: 


Theorem 8. The closed unit ball in a Banach space X is weakly sequentially com- 
pact only if X is reflexive. 


Combining theorems 6 and 7 gives the following useful result: 


Theorem 9. In a reflexive Banach space every bounded, closed, convex set is weakly
sequentially compact. 


Here is a useful application of theorem 9. 


Theorem 10. Let X be a reflexive Banach space, K a closed, convex subset of X, z 
any point of X. Then there is a point y of K which is as close to z as any other point 
of K. 


Proof. We may take z = 0, and assume that $0 \notin K$. Denote by s the distance of 0
to K, that is,

$$s = \inf |y|, \qquad y \text{ in } K. \tag{17}$$

Let $\{y_n\}$ be a minimizing sequence for (17). We may assume that each $y_n$ lies in
the intersection of K and the ball of radius 2s around the origin. This is a
bounded, closed, convex set; therefore, by theorem 9, a subsequence $\{z_n\}$ of $\{y_n\}$
converges weakly to some point z of K. According to theorem 5,

$$|z| \le \liminf |z_n|. \tag{18}$$

Since $\{z_n\}$ is the subsequence of a minimizing sequence, $\lim |z_n| = s$. Combining
this with (17) and (18) gives |z| = s; that is, z is a point of K closest to 0. □

Theorem 10 is a generalization of theorem 8 of chapter 5. There we assumed that
X is uniformly convex; here we assume only that X is reflexive.


10.3 WEAK* CONVERGENCE 


In a Banach space U that is the dual X' of another Banach space X, there is a subclass
of linear functionals associated with elements x of X:

$$x(u) = u(x). \tag{19}$$

One can define sequential convergence in U with respect to this subclass of linear
functionals:

Definition. A sequence $\{u_n\}$ in a Banach space U that is the dual of another Banach
space X is said to be weak* convergent to u if

$$\lim_{n\to\infty} u_n(x) = u(x) \tag{20}$$

for all x in X. We denote this relation as

$$\text{w*-}\lim u_n = u.$$


REMARK 1. Of course, if X is reflexive, weak* convergence is no different than
weak convergence. 


Example 5. U is the space of all signed Borel measures m on [—1, 1], of finite total 
mass. According to theorem 14 of chapter 8, U is the dual of C[—1, 1]. 


Consider the sequence $\{m_n\}$:

$$m_n(h) = \int h\, dm_n = \frac{n}{2} \int_{-1/n}^{1/n} h(t)\, dt. \tag{21}$$

Clearly, for any continuous h,

$$\lim_{n\to\infty} m_n(h) = h(0). \tag{22}$$

This shows that $m_n$ is weak* convergent to the unit mass at the origin. The dual of U
is $L^\infty[-1, 1]$, the space of all bounded measurable functions. Since (22) is not true
for some h discontinuous at 0, $m_n$ does not converge weakly.
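
Example 5 can be made concrete with a short computation. The sketch below is illustrative only: it approximates $m_n(h) = (n/2)\int_{-1/n}^{1/n} h(t)\,dt$ by a midpoint rule and compares a continuous test function with a step function that is discontinuous at 0. Both test functions, the function names and the number of cells are choices made here, not taken from the text.

```python
import math


def m_n(h, n: int, cells: int = 2000) -> float:
    """Midpoint-rule value of (n/2) * integral of h over [-1/n, 1/n]."""
    a = -1.0 / n
    width = (2.0 / n) / cells
    total = sum(h(a + (i + 0.5) * width) for i in range(cells))
    return 0.5 * n * width * total


def continuous_h(t: float) -> float:
    return math.cos(t) + t          # continuous, with h(0) = 1


def step_h(t: float) -> float:
    return 1.0 if t >= 0 else 0.0   # bounded, discontinuous at 0, h(0) = 1


for n in (1, 10, 100, 1000):
    print(n, m_n(continuous_h, n), m_n(step_h, n))
```

For the continuous function the values approach h(0) = 1, as (22) asserts. For the step function the values stay at 1/2, which differs from h(0) = 1, so the unit mass at the origin, the only candidate for a weak limit, fails to be the limit against this h.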


Theorem 11. A weak* convergent sequence {un} of points in a Banach space U = 
X' is uniformly bounded. 


Proof. Weak* convergence implies that the boundedness condition (12) holds at
every point x of X. So theorem 3 implies (13), uniform boundedness. □


Exercise 6. Show that if the sequence $\{u_n\}$ is weak* convergent to u,

$$|u| \le \liminf |u_n|.$$
Definition. A subset C of a Banach space U that is the dual of another Banach 


space X is called weak* sequentially compact if every sequence of points in C has a 
subsequence that is weak* convergent to a point of C. 




The following important result is due to Helly: 


Theorem 12. Let X be a separable Banach space, U = X'. The closed unit ball in 
U is weak* sequentially compact. 


Proof. Given a sequence $\{u_n\}$ in U,

$$|u_n| \le 1, \tag{23}$$

and a denumerable set $\{x_k\}$ in X, we can, by the diagonal process, select a
subsequence $\{v_n\}$ of $\{u_n\}$ such that

$$\lim_{n\to\infty} v_n(x_k) \tag{24}$$

exists for all $x_k$. It follows from (23) and (24) that $v_n(x)$ tends to a limit for all x that
lie in the closure of the set $\{x_k\}$. So, if we take for $\{x_k\}$ a set dense in X, $v_n(x)$ tends
to a limit for all x in X. It is easy to see that this limit is a linear function of x, and,
using (23), that it is bounded by 1. □


BIBLIOGRAPHY 


Eberlein, W. F. Weak compactness in Banach spaces, I. Proc. Nat. Acad. Sci. USA, 33 (1947): 51-53.


Helly, E. Über lineare Funktionaloperationen. S.-B. K. Akad. Wiss. Wien Math.-Naturwiss. Kl., 121
(1912): 265-297.


Mazur, S. Über konvexe Mengen in linearen normierten Räumen. Studia Math., 4 (1933): 70-84.


11


APPLICATIONS OF
WEAK CONVERGENCE


11.1 APPROXIMATION OF THE δ FUNCTION
BY CONTINUOUS FUNCTIONS


Definition. A sequence $\{k_n\}$ of continuous functions on [−1, 1] tends to the δ function
if

$$\lim_{n\to\infty} \int_{-1}^{1} f(t)\, k_n(t)\, dt = f(0) \tag{1}$$

for all continuous functions f on [−1, 1].


Theorem 1 (Toeplitz). The sequence $\{k_n\}$ of continuous functions on [−1, 1] tends
to the δ function in the sense of (1) if and only if it satisfies the following conditions:

(i)

$$\lim_{n\to\infty} \int_{-1}^{1} k_n(t)\, dt = 1. \tag{2}$$

(ii) For every $C^\infty$ function g whose support does not contain 0,

$$\lim_{n\to\infty} \int_{-1}^{1} g(t)\, k_n(t)\, dt = 0. \tag{3}$$

(iii) There is a constant c for which

$$\int_{-1}^{1} |k_n(t)|\, dt \le c \tag{4}$$

holds for all n.


Proof. Suppose that f(0) = 0; let g be a $C^\infty$ function that differs from f by less
than ε at all points t in [−1, 1], and that is zero in some interval around t = 0. Then
by (4),

$$\Big|\int_{-1}^{1} (f - g)\, k_n\, dt\Big| \le \epsilon \int_{-1}^{1} |k_n|\, dt \le c\,\epsilon. \tag{5}$$

By assumption (3), $\int g\, k_n\, dt$ tends to zero, so it follows from (5) that

$$\limsup_{n\to\infty} \Big|\int_{-1}^{1} f\, k_n\, dt\Big| \le c\,\epsilon. \tag{6}$$

Since ε is arbitrary, (1) is verified in case f(0) = 0. Every continuous function can be
decomposed as b + f, b some constant and f(0) = 0, so (1) follows from (2) for every
continuous f.

Now to the converse: condition (2) is clearly necessary, for it is a special case of
(1) when f(t) ≡ 1; the same goes for (3).

We can regard $\{k_n\}$ as a sequence in C'[−1, 1], and state (1) as

$$\text{w*-}\lim k_n = \delta.$$

According to theorem 3 of chapter 10, the norms

$$|k_n| = \int_{-1}^{1} |k_n(t)|\, dt$$

must be uniformly bounded. This proves the necessity of (4), and even more:


Corollary 1'. If (4) is violated, there exists a continuous function f for which the
left side of (1) tends to infinity. □


11.2 DIVERGENCE OF FOURIER SERIES 
Theorem 2. There exists a periodic continuous function f(θ) whose Fourier series
diverges at one point.


Proof. The Fourier series of f is

$$f(\theta) \sim \sum_{n=-\infty}^{\infty} a_n e^{in\theta}, \tag{7}$$

where

$$a_n = \int_{-\pi}^{\pi} f(\theta)\, e^{-in\theta}\, đ\theta, \qquad \text{where } đ\theta = \frac{d\theta}{2\pi}. \tag{7'}$$

The convergence of the series at, say, θ = 0 means that

$$f(0) = \lim_{N\to\infty} \sum_{n=-N}^{N} a_n. \tag{8}$$

Using (7'), we can write

$$\sum_{n=-N}^{N} a_n = \int_{-\pi}^{\pi} f(\theta)\, k_N(\theta)\, d\theta, \tag{9}$$

where

$$2\pi\, k_N(\theta) = \sum_{n=-N}^{N} e^{-in\theta}. \tag{10}$$

Using the formula for the sum of a finite geometric series, we readily get, for θ ≠ 0,

$$2\pi\, k_N(\theta) = \frac{\sin (N + 1/2)\theta}{\sin \theta/2}. \tag{10'}$$

Thus the convergence of the Fourier series of every continuous function is equivalent
to the sequence $\{k_N\}$ defined by (10) approximating the δ function. According
to theorem 1 this is the case iff conditions (2), (3), and (4) are satisfied. We show
now that condition (4) fails! To see this, we use the inequality $|\sin\theta| \le |\theta|$, which
implies that $|1/\sin(\theta/2)| \ge 2/|\theta|$. Using (10') and a little calculus, we get

$$\int_{-\pi}^{\pi} |k_N(\theta)|\, d\theta \ge \frac{1}{2\pi} \int_{-\pi}^{\pi} \Big|\sin\Big(N + \frac{1}{2}\Big)\theta\Big|\, \frac{2}{|\theta|}\, d\theta = \frac{2}{\pi} \int_0^{(N + 1/2)\pi} \frac{|\sin x|}{x}\, dx.$$

This last integral is, it is easy to show, > const. log N. Thus condition (4) fails, and
so, by Corollary 1', for some function f, the Fourier series of f diverges at θ = 0 to
infinity. □
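
The logarithmic growth that makes condition (4) fail can be observed numerically. The sketch below is an illustration under stated assumptions: it takes $k_N(\theta) = \sin((N + 1/2)\theta) / (2\pi \sin(\theta/2))$ as in (10'), approximates $\int_{-\pi}^{\pi} |k_N|\, d\theta$ by a midpoint rule on (0, π) doubled by symmetry, and compares it with $(4/\pi^2)\log N$, the classical leading term for these integrals. The function name and the resolution parameter are hypothetical.

```python
import math


def dirichlet_l1_norm(N: int, cells_per_lobe: int = 200) -> float:
    """Midpoint approximation of the integral of |k_N| over [-pi, pi]."""
    cells = cells_per_lobe * (2 * N + 1)   # resolve each oscillation of the kernel
    width = math.pi / cells                # integrate over (0, pi), then double
    total = 0.0
    for i in range(cells):
        theta = (i + 0.5) * width          # midpoints avoid theta = 0
        total += abs(math.sin((N + 0.5) * theta) / math.sin(0.5 * theta))
    return 2.0 * total * width / (2.0 * math.pi)


orders = (1, 4, 16, 64)
l1_norms = [dirichlet_l1_norm(N) for N in orders]
for N, value in zip(orders, l1_norms):
    print(N, value, (4 / math.pi ** 2) * math.log(N))
```

The computed values keep increasing with N and stay above $(4/\pi^2)\log N$, so no constant c can serve in (4).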


Exercise 1. Show that there exists a continuous periodic function whose Fourier series
diverges at n arbitrarily given points.


11.3 APPROXIMATE QUADRATURE 


An approximate quadrature formula is an approximation to the integral of a continuous
function f on, say, [−1, 1]. Take N points $t_j$ in [−1, 1], called nodes, and N
numbers $w_j$ called weights; define q(f) by

$$q(f) = \sum_{j=1}^{N} w_j\, f(t_j). \tag{11}$$

We regard q as an approximation to

$$\int_{-1}^{1} f(t)\, dt. \tag{11'}$$


Theorem 3. Let $q_N$ be a sequence of quadrature formulas of form (11), satisfying
the following conditions:

(i) For every nonnegative integer k,

$$\lim_{N\to\infty} q_N(t^k) = \int_{-1}^{1} t^k\, dt. \tag{12}$$

(ii) For all N,

$$\sum_{j=1}^{N} |w_j^{(N)}| \le c, \tag{13}$$

c a constant. Then

$$\lim_{N\to\infty} q_N(f) = \int_{-1}^{1} f(t)\, dt \tag{14}$$

for all continuous f. Conversely, if (14) holds for all continuous f, (12) and
(13) must be satisfied.


Proof. It follows from (12) that (14) holds for all polynomials f. Inequality (13)
asserts that the linear functionals $q_N$ on C[−1, 1] have uniformly bounded norms.
Since the polynomials are dense in C[−1, 1], (14) follows for all continuous f. The
converse follows from theorem 3 of chapter 10. □
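
As a concrete instance of theorem 3, consider the composite trapezoid rule, chosen here only as a familiar example and not discussed in the text. Its weights are positive and sum to 2, so (13) holds with c = 2, and it integrates powers of t with an error that tends to zero, so (12) holds. The sketch below checks these facts and then applies the rule to the continuous, non-smooth function |t|; the function names are hypothetical.

```python
def trapezoid_rule(N: int):
    """Nodes and weights of the composite trapezoid rule with N points on [-1, 1]."""
    h = 2.0 / (N - 1)
    nodes = [-1.0 + j * h for j in range(N)]
    weights = [h] * N
    weights[0] = weights[-1] = h / 2    # end points carry half weight
    return nodes, weights


def apply_rule(rule, f) -> float:
    """q(f): the weighted sum of values of f at the nodes, as in formula (11)."""
    nodes, weights = rule
    return sum(w * f(t) for t, w in zip(nodes, weights))


rule = trapezoid_rule(1001)
weight_sum = sum(abs(w) for w in rule[1])          # condition (13) with c = 2
second_moment = apply_rule(rule, lambda t: t * t)  # should be near 2/3
abs_integral = apply_rule(rule, abs)               # integral of |t| is 1
print(weight_sum, second_moment, abs_integral)
```

With positive weights the sum in (13) equals $q_N(1)$, which converges by (12) with k = 0; this is the observation behind the exercise that follows.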


Exercise 2. Prove that if the weights $w_j$ are positive, (13) follows from (12).


11.4 WEAK AND STRONG ANALYTICITY OF 
VECTOR-VALUED FUNCTIONS 


Let f be a function defined in some domain G of the complex ζ plane, whose values
lie in some complex Banach space X.


Definition. f(ζ) is strongly analytic in G if the limit

$$\lim_{h\to 0} \frac{f(\zeta + h) - f(\zeta)}{h}$$

exists in the norm topology at every point of G.


Definition. f(ζ) is weakly analytic in G if for every bounded linear functional $\ell$,
$\ell(f(\zeta))$ is an analytic function of ζ in the classical sense.


N. Dunford has proved the following surprising result:


Theorem 4. A weakly analytic function is strongly analytic.


Proof. If $\ell(f(\zeta))$ is analytic in G, we can represent it by the Cauchy integral
formula

$$\ell(f(\zeta)) = \frac{1}{2\pi i} \int_C \frac{\ell(f(x))}{x - \zeta}\, dx, \tag{15}$$

C some rectifiable curve winding around ζ. Similar formulas hold when ζ is replaced
by ζ + k and ζ + h, k and h small enough. Assume that k ≠ 0, h ≠ 0, h ≠ k; then
we can express the difference quotient of difference quotients thus:

$$\frac{1}{h - k} \Big[ \frac{\ell(f(\zeta + h)) - \ell(f(\zeta))}{h} - \frac{\ell(f(\zeta + k)) - \ell(f(\zeta))}{k} \Big] = \frac{1}{2\pi i} \int_C \frac{\ell(f(x))\, dx}{(x - \zeta - h)(x - \zeta - k)(x - \zeta)}. \tag{16}$$

For fixed $\ell$, and |h|, |k| small enough, the right side of (16) is bounded by a constant
M independent of h and k. We can rewrite the left side as $\ell(x_{h,k})$ where

$$x_{h,k} = \frac{1}{h - k} \Big[ \frac{f(\zeta + h) - f(\zeta)}{h} - \frac{f(\zeta + k) - f(\zeta)}{k} \Big]. \tag{17}$$

Weak analyticity thus implies that for each $\ell$ and all h and k sufficiently small,
$|\ell(x_{h,k})| \le M(\ell)$. We appeal now to the principle of uniform boundedness,
theorem 4, chapter 10, and conclude that $|x_{h,k}| \le c$ for all h, k sufficiently small. By
definition (17) of $x_{h,k}$ this implies the norm inequality

$$\Big| \frac{f(\zeta + h) - f(\zeta)}{h} - \frac{f(\zeta + k) - f(\zeta)}{k} \Big| \le c\,|h - k|. \tag{18}$$

Since X is complete, it follows that the difference quotients of f(ζ) tend to a limit
in the strong sense. □


11.5 EXISTENCE OF SOLUTIONS OF PARTIAL 
DIFFERENTIAL EQUATIONS 


We denote by L a first-order partial differential operator of the following form, acting
on vector-valued functions:

$$ L = \sum_{j=1}^{m} A_j \partial_j + B. \tag{19} $$

Here A_j and B are square matrix valued functions of the independent variables s_j,
A_j(s) being once differentiable, B(s) continuous, and

$$ \partial_j = \frac{\partial}{\partial s_j}. $$

We assume, for simplicity, that A_j and B, as well as the functions on which L acts,
are periodic in all variables s and that they are real valued. We denote, as customary,
the formal adjoint of L by L*:

$$ L^* = -\sum_{j} \partial_j A_j^T + B^T, \tag{19'} $$


where A_j^T, B^T denote transposes. Integration by parts shows that for any pair of C^1
vector-valued periodic functions u and v,

$$ (v, Lu) = (L^* v, u). \tag{20} $$

Here the bracket denotes the L^2 scalar product over a period cube:

$$ (v, w) = \int_F v(s) \cdot w(s)\, ds; $$

the dot denotes the dot product between vectors, and F a period cube.
Suppose that every A_j is symmetric: A_j^T = A_j. Then comparing (19) and (19'),
we deduce that

$$ L + L^* = -\sum_j A_{j,j} + B + B^T. $$

Here A_{j,j} denotes the partial derivative of A_j with respect to s_j. Setting this into
(20), and choosing v = u, we get

$$ 2(u, Lu) = ((L + L^*) u, u) = \Big(\Big[B + B^T - \sum_j A_{j,j}\Big] u, u\Big). \tag{20'} $$
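The comparison of (19) and (19') can be written out step by step. The following is a sketch of the product-rule computation, applied to an arbitrary C^1 periodic function v and using only the symmetry assumption A_j^T = A_j:

```latex
\begin{align*}
L^* v &= -\sum_j \partial_j \big(A_j^T v\big) + B^T v
       = -\sum_j \partial_j \big(A_j v\big) + B^T v \\
      &= -\sum_j A_j\, \partial_j v \;-\; \sum_j A_{j,j}\, v \;+\; B^T v, \\
(L + L^*)\, v &= \sum_j A_j\, \partial_j v + B v
       - \sum_j A_j\, \partial_j v - \sum_j A_{j,j}\, v + B^T v
       = \Big( B + B^T - \sum_j A_{j,j} \Big) v .
\end{align*}
```

The first-order terms cancel, so L + L* is multiplication by a matrix and involves no differentiation.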
Using the language of distributions, see Appendix B, we state 


Theorem 5. Suppose the matrix on the right in (20') is positive definite:

$$ B + B^T - \sum_j A_{j,j} \ge kI, \qquad k > 0. \tag{21} $$

Then for every periodic, square integrable f the equation

$$ Ly = f \tag{22} $$

has a solution y in the sense of distributions that is periodic and square integrable.


Proof. It follows from (20') and (21) that every periodic C^1 function u satisfies the
inequality

$$ (u, Lu) \ge \frac{k}{2}\, \|u\|^2, \tag{21'} $$

where ‖u‖ denotes the L^2(F)-norm. Denote the Hilbert space L^2(F) by H. Let Y
be any finite-dimensional subspace of H consisting of periodic C^1 functions; denote
the orthogonal complement of Y in H by Y^⊥. Consider the equation

$$ Ly - f \in Y^\perp \tag{$22_Y$} $$

for y in Y. These are N linear equations for y in Y, N = dim Y. According to
linear algebra, such a system of linear equations has a solution for every f iff the
homogeneous equation

$$ Lz \in Y^\perp, \qquad z \text{ in } Y, \tag{23} $$

is satisfied only by z = 0. Take the scalar product of (23) with z; using (21'), we get

$$ 0 = (z, Lz) \ge \frac{k}{2}\, \|z\|^2, $$

which implies that z = 0. So it follows that (22_Y) has a unique solution y. Take the
scalar product of (22_Y) with y; using (21') and the Schwarz inequality, we get

$$ \frac{k}{2}\, \|y\|^2 \le (y, Ly) = (y, f) \le \|y\|\, \|f\|. $$

This implies that

$$ \|y\| \le \frac{2}{k}\, \|f\|. \tag{24} $$


Now let Y_N be an increasing sequence of subspaces of C^1 functions whose union
is dense in H. Denote by y_N the solution of (22_Y) for Y = Y_N. It follows from (24) that ‖y_N‖
is a uniformly bounded sequence. Therefore, since H is reflexive, we appeal to
theorem 7 of chapter 10 to conclude that a subsequence of {y_N}, also denoted as {y_N},
converges weakly:

$$ \text{w-}\lim y_N = y. $$

Let v belong to ∪ Y_N; since each Y_N consists of differentiable functions, v is
differentiable, for it belongs to some Y_M. For y_N in Y_N,

$$ L y_N - f \in Y_N^\perp. $$

Take the scalar product of this with v; for N > M we get

$$ (v, L y_N) - (v, f) = 0. $$

Since v is differentiable, this can be rewritten by (20) as

$$ (L^* v, y_N) - (v, f) = 0. $$


Since the sequence y_N converges weakly to y, we conclude that for every v in ∪ Y_N

$$ (L^* v, y) - (v, f) = 0. \tag{25} $$

We can choose the spaces Y_N so that their union is dense not only in H but also in
H_1, the space of all periodic L^2 functions whose first derivatives belong to L^2. That
means that given any periodic C^1 function v, there is a sequence {v_k} of functions
in ∪ Y_N such that v_k converges in the L^2-norm to v, and the first derivatives of v_k
converge to the first derivatives of v in the L^2-norm. Since L* is a first-order operator,
it follows that L* v_k tends to L* v in the L^2-norm. Setting v = v_k in (25), we can
pass to the limit and conclude that (25) holds for all v in C^1.

A function y that satisfies (25) for all C^1 functions v is said to satisfy the
differential equation (22) in the weak sense. Clearly, such a y is a solution of (22) in the
sense of distributions, which requires (25) to hold for all C^∞ functions v. □


Friedrichs has shown that a weak solution y of (22) is a strong solution in the
following sense: there is a sequence of C^1 functions z_n that converge to y in the L^2
sense, and at the same time Lz_n converges to f in the L^2 sense. It is easy to show,
using (21'), that equation (22) has only one strong solution. It follows that not only a
subsequence, but the whole sequence y_N converges.

The method described in this section to obtain the solution y of equation (22) as
the weak limit of the solutions y_N of equation (22_Y) is called Galerkin's method. It
is more than a theoretical device for proving the existence of a solution of (22); it is
also a practical method for constructing it.
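As a practical illustration of Galerkin's method, here is a minimal sketch for the simplest special case: one periodic variable and scalar functions, with L = d/ds + b(s). The coefficient b(s) = 2 + cos s, the trigonometric subspaces Y_N, the quadrature rule, and all function names are assumptions of this example, not taken from the text. With A = 1 and B = b, the matrix in (21) is 2b(s) ≥ 2, so the hypothesis of theorem 5 holds with k = 2.

```python
import math

def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            m = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= m * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def galerkin(b, f, N, M=64):
    """Coefficients of y_N in the basis 1, cos s, sin s, ..., cos Ns, sin Ns."""
    # Each basis element is stored as a pair (function, derivative).
    basis = [(lambda s: 1.0, lambda s: 0.0)]
    for j in range(1, N + 1):
        basis.append((lambda s, j=j: math.cos(j * s),
                      lambda s, j=j: -j * math.sin(j * s)))
        basis.append((lambda s, j=j: math.sin(j * s),
                      lambda s, j=j: j * math.cos(j * s)))
    nodes = [2 * math.pi * i / M for i in range(M)]
    weight = 2 * math.pi / M

    def inner(g, h):
        # Periodic trapezoid rule for the L^2 scalar product over one period.
        return weight * sum(g(s) * h(s) for s in nodes)

    # Row i encodes the requirement (v_i, L y_N - f) = 0, that is, L y_N - f
    # is orthogonal to the basis element v_i of Y_N.
    A = [[inner(vi, lambda s, vj=vj, dvj=dvj: dvj(s) + b(s) * vj(s))
          for (vj, dvj) in basis] for (vi, _) in basis]
    rhs = [inner(vi, f) for (vi, _) in basis]
    return solve_linear(A, rhs)

b = lambda s: 2.0 + math.cos(s)
f = lambda s: math.cos(s) + (2.0 + math.cos(s)) * math.sin(s)  # equals L applied to sin s
coeffs = galerkin(b, f, N=2)
print([round(c, 10) for c in coeffs])
```

Here f was chosen so that the exact solution sin s already lies in Y_2. By the uniqueness argument in the proof, the Galerkin solution must then coincide with it, so the printed coefficients should be close to those of sin s in the chosen basis.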


11.6 THE REPRESENTATION OF ANALYTIC FUNCTIONS 
WITH POSITIVE REAL PART 


Let f(ζ) be an analytic function in the unit disk |ζ| < 1 whose real part is positive:

$$ h(\zeta) = \operatorname{Re} f(\zeta) > 0, \qquad |\zeta| < 1. $$

Every analytic function defined in a disk and continuous up to the boundary can be
expressed, up to an imaginary constant, in terms of its real part on the boundary
by the Poisson integral. On the disk of radius R < 1 we have for |ζ| < R

$$ f(\zeta) = \frac{1}{2\pi} \int_0^{2\pi} \frac{R + \zeta e^{-i\theta}}{R - \zeta e^{-i\theta}}\, h(Re^{i\theta})\, d\theta + ic. \tag{26} $$

Setting ζ = 0 and taking real parts, we see that

$$ h(0) = \frac{1}{2\pi} \int_0^{2\pi} h(Re^{i\theta})\, d\theta. \tag{26'} $$

Let R → 1 through a sequence R_n → 1. The functions h(R_n e^{iθ}) are nonnegative
functions of θ whose mean values over the whole circle are, by (26'), all equal to h(0).




We associate with each R_n a linear functional

$$ \ell_n(u) = \frac{1}{2\pi} \int_0^{2\pi} h(R_n e^{i\theta})\, u(\theta)\, d\theta \tag{27} $$

acting on the space C of continuous functions u on the circle S^1. It follows from
(27), (26'), and the nonnegativity of h that

$$ |\ell_n| = h(0). $$

Since C(S^1) is separable, we can appeal to Helly's theorem, theorem 12 of
chapter 10, and conclude that a subsequence of {ℓ_n} is weak*-convergent to some limit ℓ:

$$ \lim_{n \to \infty} \ell_n(u) = \ell(u) \tag{28} $$

for all continuous functions u. It follows from (28) and the uniform boundedness of
|ℓ_n| that for any sequence u_n strongly convergent to u

$$ \lim_{n \to \infty} \ell_n(u_n) = \ell(u). \tag{28'} $$

We apply this now to

$$ u_n = \frac{R_n + \zeta e^{-i\theta}}{R_n - \zeta e^{-i\theta}}, \qquad u = \frac{1 + \zeta e^{-i\theta}}{1 - \zeta e^{-i\theta}}, $$

ζ any complex number with |ζ| < 1; using (26), (27) and (28'), we get

$$ f(\zeta) = \ell\!\left( \frac{1 + \zeta e^{-i\theta}}{1 - \zeta e^{-i\theta}} \right) + ic. $$


The functionals ℓ_n defined by (27) are clearly nonnegative; therefore so is their
weak* limit ℓ. According to corollary 14' of the Riesz representation theorem,
chapter 8, such a nonnegative functional acting on C(S^1) can be represented as an integral
with respect to a positive measure m. Thus we have proved the first part of


Theorem 6 (Herglotz-Riesz). Every analytic function f in the unit disk |ζ| < 1
whose real part is positive there can be expressed as

$$ f(\zeta) = \int \frac{1 + \zeta e^{-i\theta}}{1 - \zeta e^{-i\theta}}\, dm + ic, \qquad m \text{ a positive measure, } c \text{ real.} \tag{29} $$

Conversely every function f so represented is analytic in the unit disk and has
positive real part there. The representation (29) is unique.


Proof. That (29) represents an analytic function with positive real part in the unit
disk for any positive measure m is evident from formula (30) below. To see that the
representation is unique, we note that the real part of (29) is

$$ h(\zeta) = \int \frac{1 - r^2}{1 - 2r\cos(\phi - \theta) + r^2}\, dm(\theta), \qquad \zeta = re^{i\phi}. \tag{30} $$

Take any continuous function u(φ), multiply (30) by u(φ), and integrate with respect
to dφ/2π over S^1. We get, after interchanging the order of integration on the right,

$$ \frac{1}{2\pi} \int h(re^{i\phi})\, u(\phi)\, d\phi = \int u_r(\theta)\, dm, \tag{31} $$

where

$$ u_r(\theta) = \frac{1}{2\pi} \int \frac{1 - r^2}{1 - 2r\cos(\phi - \theta) + r^2}\, u(\phi)\, d\phi. $$

Suppose that h(ζ) can be represented in the form (30) by two different measures m
and m'. Let r → 1 in (31); by theorem 1 of this chapter, u_r → u in the maximum
norm. Since the left side of (31) does not depend on the representing measure, it
follows that

$$ \int u\, dm = \int u\, dm' $$

for all continuous functions u. We appeal now to the uniqueness of measure in the
Riesz representation theorem to conclude that m = m'. This completes the proof of
theorem 6. □


From the uniqueness of the measure m it follows that the limit (28) exists not only
for a subsequence but for every sequence R_n tending to 1.
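The Poisson representation on a disk of radius R < 1 and the mean value property that underlie this section can be checked numerically. The sketch below does so for the sample function f(ζ) = (1 + ζ)/(1 - ζ), which is analytic with positive real part in the unit disk. The choice of f, the normalization by 1/2π, the equispaced quadrature, and the function names are assumptions of this example.

```python
import cmath
import math

def f(z):
    # Sample analytic function with positive real part in the unit disk.
    return (1 + z) / (1 - z)

def poisson_representation(z, R, M=2000):
    """Mean over the circle of kernel times Re f(R e^{i theta}), plus i * Im f(0)."""
    total = 0.0
    for n in range(M):
        theta = 2 * math.pi * n / M
        w = cmath.exp(-1j * theta)
        kernel = (R + z * w) / (R - z * w)
        h = f(R * cmath.exp(1j * theta)).real
        total += kernel * h
    return total / M + 1j * f(0).imag

def mean_of_h(R, M=2000):
    """Mean value of Re f over the circle of radius R."""
    return sum(f(R * cmath.exp(2j * math.pi * n / M)).real for n in range(M)) / M

z = 0.3 + 0.2j
print(abs(poisson_representation(z, 0.9) - f(z)))   # discrepancy in the Poisson formula
print(abs(mean_of_h(0.9) - f(0).real))              # discrepancy in the mean value property
```

Both printed discrepancies should be at the level of rounding error. For this particular f the representing measure of theorem 6 is the unit point mass at θ = 0 with c = 0, since the integrand of the representation evaluated at θ = 0 is f itself.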


BIBLIOGRAPHY 


Friedrichs, K. O. The identity of the weak and strong extension of differential operators. Trans. AMS, 55
(1944): 132-151.


Herglotz, G. Über Potenzreihen mit positivem reellem Teil im Einheitskreis. S.B. Sächs. Akad. Wiss., 63
(1911): 501-511.


Riesz, F. Sur certains systèmes singuliers d'équations intégrales. An. l'École Normale Sup. (3), 28 (1911):


Toeplitz, O. Über allgemeine lineare Mittelbildungen. Prace Math.-Fiz., 22 (1911): 113-119.


12


THE WEAK AND WEAK* TOPOLOGIES


Definition. The weak topology in a Banach space is the weakest topology in which
all bounded linear functionals are continuous.

Since bounded linear functionals are continuous in the norm (strong) topology, it
follows that the weak topology is coarser than the strong topology. We show now that,
except in finite-dimensional spaces, the weak topology is genuinely coarser than the
strong topology:


The open sets in the weak topology are unions of finite intersections of sets of the
form

$$ \{x : a < \ell(x) < b\}. \tag{1} $$

Clearly, in an infinite-dimensional space a nonempty intersection of a finite number of
sets of form (1) is unbounded. This shows that every nonempty set that is open in the
weak topology is unbounded. In particular, the balls

$$ \{x : |x| < R\}, \tag{2} $$

open in the strong topology, are not open in the weak topology.

Next we show that the weak topology is coarser than weak sequential convergence,
in the following sense: define the weak sequential closure of a set S in a
Banach space X as the set of weak limits of all weakly convergent sequences in S.


Theorem 1. 


(i) The weak sequential closure of any set S belongs to the closure of S in the 
weak topology. 

(ii) In every infinite-dimensional Banach space there are sets weakly sequentially 
closed, but not closed in the weak topology. 


Proof. Part (i) is an immediate consequence of the definitions of weak conver- 
gence and weak topology. Part (ii) is exemplified by the following set S: 


$$ S = \bigcup_k S_k. \tag{3} $$

Each set S_k is finite, constructed as follows: Choose any sequence of subspaces
X_k of X, dim X_k = k. S_k consists of points x_{k,j}, finite in number, so chosen that for
every point x in X_k of norm k, |x| = k, there is a point x_{k,j} in S_k such that

$$ |x - x_{k,j}| < \frac{1}{k}, \qquad |x_{k,j}| = k. \tag{4} $$


We claim that the origin belongs to the closure of S in the weak topology. Any open
set containing the origin contains a subset of the form

$$ \{x : |\ell_i(x)| < \epsilon,\ i = 1, \ldots, n\}, \tag{5} $$

where the ℓ_i are linear functionals of norm 1. Since X_k is k-dimensional, for k > n
it contains a nonzero vector x_k such that

$$ \ell_i(x_k) = 0, \quad i = 1, \ldots, n, \qquad |x_k| = k. \tag{6} $$

By construction, there is an x_{k,j} in S_k satisfying condition (4) for x = x_k. Using (4),
(6), and |ℓ_i| = 1, we get for any ℓ = ℓ_i, i = 1, ..., n, that

$$ |\ell(x_{k,j})| = |\ell(x_{k,j} - x_k)| \le |x_{k,j} - x_k| < \frac{1}{k}. \tag{7} $$

Therefore for k > n and k > 1/ε, the point x_{k,j} belongs to the subset (5). This proves
that the origin belongs to the closure of S in the weak topology.
On the other hand, S contains only a finite number of points in any ball of radius
R. So, by the principle of uniform boundedness (see theorem 4 of chapter 10), S
contains no weakly convergent sequences other than the trivial ones. Since the origin
does not belong to S, the set S is weakly sequentially closed but not closed in the
weak topology. □


Despite the coarseness of the weak topology compared to the strong topology, the
following is true:


Theorem 2. Every convex subset K of a Banach space X that is closed in the strong
topology is closed in the weak topology.


Proof. We will show that if z in X does not belong to K, then z is not in the
weak closure of K. Since K is closed in the strong topology, there is an open ball
B_R(z) centered at z that is disjoint from K. According to the hyperplane separation
theorem, theorem 6 of chapter 3, there is a nonzero functional ℓ and a constant c such
that

$$ \ell(u) \le c \le \ell(v) \tag{8} $$

for all u in K and all v in B_R(z). As explained in the proof of theorem 17 of chapter 8,
the norm of ℓ is bounded by 1/R. The points of B_R(z) are of the form v = z + x,
|x| < R. Since ℓ(v) = ℓ(z) + ℓ(x), it follows that

$$ \inf_{v \in B_R(z)} \ell(v) = \ell(z) + \inf_{|x| < R} \ell(x) = \ell(z) - |\ell| R. $$

Setting this into (8), we get that ℓ(z) > c; so we conclude, again by (8), that the
open halfspace

$$ \{w : \ell(w) > c\} \tag{8'} $$

contains z but contains no point of K. Since (8') is an open set in the weak topology,
it follows that z does not belong to the closure of K in the weak topology. □

Theorem 2 bears an analogy to theorem 6, chapter 10.

Suppose that U is a Banach space that is the dual of another Banach space X:

$$ U = X'. \tag{9} $$

Then there is in U a natural class of linear functionals, those associated with elements
of X:

$$ x(u) = u(x). \tag{10} $$


Definition. The weak* topology in a Banach space U that is the dual of another
Banach space X is the crudest topology in which all linear functionals (10) are
continuous.


For U of form (9) and nonreflexive, the weak* topology is genuinely coarser than
the weak topology, as will be clear from the following theorems. The first is due to
Alaoglu.


Theorem 3. The closed unit ball B in a Banach space U that is the dual of another
Banach space X is compact in the weak* topology.


Proof. To u in B we assign the array of numbers {u(x)}, x in X. Since |u| ≤ 1,
|u(x)| ≤ |x|. Regard each array of numbers as a point in the product space P:

$$ P = \prod_{x \in X} I_x, \qquad I_x = [-|x|, |x|], \tag{11} $$

and regard

$$ u \to \{u(x)\} \tag{12} $$

as a mapping of B into P. The elements of P are vectors whose components are
points in the factors I_x. The natural mappings in P are the projections of vectors
onto their components, and the natural topology is the weakest one in which all these
mappings are continuous. The mapping (12) is one-to-one, so it embeds B in P. It
follows from the definitions that the weak* topology of B is the same as the inherited
topology from P under the embedding. Each I_x is a compact interval of R; it follows
from Tychonov's theorem that P is compact. Since a closed subset of a compact set
is compact, it suffices to prove that (12) maps B onto a closed subset of P.

Let p be a point in the closure of the image of B under (12); we will show that
then p is the image of some u in B, that is,

$$ p_x = u(x) \quad \text{for all } x \text{ in } X, \tag{13} $$

where {p_x} denote the components of p. Equation (13) defines a function u(x) on
X. We have to show that it is bounded by 1 and linear. Boundedness follows from
the fact that p_x belongs to I_x = [-|x|, |x|]. Linearity means that

$$ p_{x+y} = p_x + p_y, \qquad p_{ax} = a\, p_x. \tag{14} $$

For every q that is the image of a point of B in P under the mapping (12),

$$ q_{x+y} = q_x + q_y, \qquad q_{ax} = a\, q_x. \tag{15} $$

Since p lies in the weak* closure of {q}, and since these relations involve only 3
(resp. 2) components of q, (14) follows from (15). □


Theorem 3’. A subset S of U = X' that is closed in the w* topology is w* compact 
iff it is bounded in norm. 


Proof. If S is bounded in norm, it belongs to some closed ball B_R. According to
theorem 3, B_R is w*-compact, but then so is its w*-closed subset S.

Conversely, suppose that S is w*-compact; then so is its image {u(x)} under the
continuous mapping u → u(x) for every x. A compact set in R (or C) is bounded,
so for every x in X, |u(x)| ≤ b(x) for every u in S. But then by theorem 3 of
chapter 10, the principle of uniform boundedness, |u| ≤ b for all u in S.


Theorem 4. The closed unit ball in a Banach space Z is compact in the weak
topology iff Z is reflexive.


Proof. The "if" part follows from theorem 3. The "only if" is due to Eberlein and
Smulyan; see Dunford and Schwartz.


BIBLIOGRAPHY 


Alaoglu, L. Weak topologies of normed linear spaces. An. Math., 41 (1940): 252-267.


Chernoff, P. R. A simple proof of Tychonoff's theorem via nets. Am. Math. Monthly, 99 (1992): 932-934.


Dunford, N. and Schwartz, J. Linear Operators: Part 1: General Theory. Wiley-Interscience, 1957,
pp. 423-425.


Eberlein, W. F. Weak compactness in Banach spaces. Proc. Nat. Acad. Sci. USA, 33 (1947): 51-53.


Smulyan, V. I. Über lineare topologische Räume. Math. Sbornik, N.S., 7 (1940): 425-448.


Tychonoff, A. Über die topologische Erweiterung von Räumen. Math. An., 102 (1929-30): 544-561.


13


LOCALLY CONVEX TOPOLOGIES AND THE
KREIN-MILMAN THEOREM


The weak and weak* topologies are the weakest in which certain linear functionals
are continuous. If one demands the continuity of even fewer functionals, one gets
even weaker topologies. All these topologies have the property that openness can be
defined in terms of convex sets. In this chapter we develop, and apply, the theory of
such topologies.


Definition. A locally convex topological (LCT) linear space is a linear space over 
the reals with a Hausdorff topology that has the following properties: 


(i) Addition is continuous; that is, (x, y) → x + y is a continuous mapping of
X × X into X.

(ii) Multiplication by scalars is continuous; that is, (k, x) → kx is a continuous
mapping of R × X into X.


(iii) There is a basis for the open sets at the origin consisting of convex sets; that is, 
every open set containing the origin contains a convex open set containing the 
origin. 


Note that the norm topology of a Banach space is a locally convex topology; the 
convex, open sets containing the origin that form a basis for the topology are the 
open balls centered at the origin. 


Exercise 1. Show that the weak and weak* topologies are locally convex. 


Exercise 2. Let {ℓ_α} be a collection of linear functions in a linear space X over R
that separates points; that is, for any two distinct points x and y of X there is an ℓ_α
such that ℓ_α(x) ≠ ℓ_α(y).

(a) Show that the weakest topology in which all the ℓ_α are continuous is locally
convex.

(b) Show that a linear functional ℓ is continuous in the topology above iff it is a
finite linear combination of the ℓ_α.


Exercise 3. Show that a continuous function f(a, b) on a product of two topological 
spaces whose values lie in a topological space is a continuous function of a when b 
is fixed. 


Theorem 1. 


(i) In a LCT linear space X the collection of open sets is translation invariant;
that is, if T is an open set, so is T - x, for any x in X.
(ii) If T is an open set, so is kT, for k ≠ 0; in particular, -T is open.
(iii) Every point of an open set T is interior to T.


Proof. According to exercise 3, x + y is a continuous function of y for x fixed. 
The inverse image of the open set T under this mapping is T — x; this proves part (i). 
Part (ii) follows similarly. 

(iii) By exercise 3, kx is a continuous function of k for x fixed. So the set of k for 
which kx lies in some open set T is an open subset of R. Suppose that T contains the 
origin; then k = 0 belongs to this set, and by the above observation, so does an open 
interval containing k = 0. That means that for k small enough, kx belongs to T, but 
that is what it means for the origin to be an interior point of T. This proves (iii), for 
by (i) any point of T can be shifted to the origin. 


13.1 SEPARATION OF POINTS BY LINEAR FUNCTIONALS 


Theorem 2. The continuous linear functionals in a LCT linear space X separate
points. That is, if y and z are distinct points of X, there is a continuous linear
functional ℓ such that

$$ \ell(y) \ne \ell(z). \tag{1} $$


Proof. We construct a linear functional separating y and z. Without loss of
generality we take y = 0. Since the topology is Hausdorff, there is an open set T
containing y = 0 but not z; by (iii) of the definition of LCT space, we may take T to be
convex. By theorem 1, 0 is an interior point of T, so the gauge function p_T of T is
finite, and

$$ p_T(u) < 1 \quad \text{for all } u \text{ in } T. \tag{2} $$

According to the hyperplane separation theorem, theorem 5 of chapter 3, there exists
a linear functional ℓ satisfying

$$ \ell(z) = 1, \tag{3} $$

$$ \ell(x) \le p_T(x) \quad \text{for all } x. \tag{4} $$

Clearly, since ℓ(y) = ℓ(0) = 0, ℓ separates y and z.
To complete the proof, we have to show that ℓ is continuous. We first show that
every halfspace {w : ℓ(w) < c} is open. We claim that if w belongs to the halfspace,
so does the open set

$$ w + rT, \qquad r = c - \ell(w). \tag{5} $$

Using (4), (5), and (2), we get that for u in T,

$$ \ell(w + ru) = \ell(w) + r\,\ell(u) \le \ell(w) + r\,p_T(u) < \ell(w) + r = c, \tag{6} $$

showing that every point of (5) lies in the halfspace; so the halfspace is open. We can
show similarly that every halfspace of the form

$$ \{w : \ell(w) > d\} $$

is open. For this argument p_T needs to be an even function; this will be the case
if T is symmetric around the origin. This can be accomplished by replacing T, if
necessary, by T ∩ (-T).


Theorem 2 can be sharpened as follows: 


Theorem 2'. Denote by K a closed, convex set in a LCT linear space X, z a point
in X not in K. Then there is a continuous linear functional ℓ such that

$$ \ell(y) < c \quad \text{for all } y \text{ in } K, \qquad \ell(z) > c. \tag{7} $$


Proof. The proof is similar to that of theorem 2, except that in place of the
hyperplane separation theorem we use the extended version, theorem 6 in chapter 3. □


Exercise 4. Let K denote a convex subset of a LCT linear space X. Show that the
closure K̄ of K also is convex.


13.2 THE KREIN-MILMAN THEOREM 


We recall from the end of chapter 1 the notion of an extreme subset of a convex set 
K and, in particular, the notion of an extreme point. A subset E of a convex set K is 
called an extreme subset of K if: 


(i) E is convex and nonempty. 


(ii) Whenever a point x of E is expressed as a convex combination of y and z in 
K, then both y and z belong to E. 


An extreme set consisting of a single point is called an extreme point.

Exercise 5. Show that the nonempty intersection of extreme sets is extreme.

The elementary properties of extreme sets are contained in theorems 7 and 8 of
chapter 1. The basic result concerning convex sets in finite-dimensional spaces is the
following theorem of Carathéodory:


Every compact convex subset K in R^N has extreme points, and every point of K
can be written as a convex combination of N + 1 extreme points.


Exercise 6. Furnish a proof of Carathéodory's theorem by induction on N.


M. G. Krein and D. P. Milman have given the following beautiful, and useful,
generalization of this result:


Theorem 3. Let X be a LCT linear space, K a nonempty, compact, convex subset
of X.


(i) K has at least one extreme point.


(ii) K is the closure of the convex hull of its extreme points.


Proof. Consider the collection {E_j} of all nonempty closed extreme subsets of K.
This collection is nonempty, for it contains K itself. Partially order this collection by
inclusion. We claim that every totally ordered subcollection {E_j} has a lower bound.
That lower bound is the intersection ∩E_j. To see this, we have to show that ∩E_j is
nonempty, closed, and extreme.

We claim that every finite subset of the totally ordered collection {E_j} has a
nonempty intersection. This is because, being totally ordered by inclusion, the
intersection of a finite subset of the collection {E_j} is the smallest member of that
subset. To conclude that ∩E_j is nonempty, we argue indirectly: suppose that the
intersection is empty. Then the union of the complements of the E_j covers K. Since K
is compact, a finite collection of these already covers K, but then the intersection of
this finite number of E_j is empty, contrary to what we have already shown. Being
the intersection of closed sets, ∩E_j is closed. By exercise 5 the nonempty intersection
of extreme subsets of a convex set K is itself an extreme subset of K.

We conclude from Zorn's lemma that K has a closed extreme subset E that is
minimal with respect to inclusion. We claim that such an E consists of a single point.
To see this, suppose, on the contrary, that E contains two distinct points. According
to theorem 2, there exists a continuous linear functional ℓ that separates these points.
Since E is compact, and ℓ continuous and not constant on E, ℓ achieves its maximum
on some proper subset M of E. Since ℓ is continuous and E is closed, M is closed.
Since the inverse image of an extreme subset is extreme (see corollary 8' in
chapter 1), the set M where a linear functional ℓ assumes its maximum on a convex set
E is an extreme subset of E. It is easy to show further (see theorem 7 of chapter 1)
that if E is an extreme subset of K, and M an extreme subset of E, then M is an
extreme subset of K. Since E is a minimal extreme subset of K, and M an extreme
subset smaller than E, we have a contradiction, into which we got by assuming that
E contains more than one point. We conclude therefore that a minimal E consists of
a single point. This single point is an extreme point of K. This completes the proof
of part (i) and gives a little more:


(i’) Every closed, extreme subset of K contains an extreme point. 


We turn now to the proof of part (ii). Denote by K_e the set of extreme points of
K, and by K̂_e the convex hull of K_e. To show that every point of K belongs to the
closure of K̂_e is the same as showing that a point z that does not belong to the closure
of K̂_e does not belong to K. According to exercise 4, the closure of K̂_e is convex. So,
if z does not belong to the closure, then according to theorem 2' there is a continuous
linear functional ℓ such that

$$ \ell(y) < c \quad \text{for all } y \text{ in the closure of } \hat{K}_e, \qquad \ell(z) > c. \tag{8} $$

Since K is compact and ℓ continuous, ℓ achieves its maximum over K on some
closed subset E of K. According to corollary 8', chapter 1, E is an extreme subset
of K. According to part (i') of theorem 3 noted above, E contains some extreme point
p of K. Since p belongs to K_e, and so to K̂_e, it follows from (8) that ℓ(p) < c. Since
by construction ℓ(p) = max_K ℓ(x), ℓ(x) ≤ ℓ(p) < c for all x in K. Since by (8),
ℓ(z) > c, this proves that z does not belong to K. □
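In the plane, the finite-dimensional statement of Carathéodory quoted earlier in this section can be made concrete. The sketch below treats the convex hull of a finite point set in R^2: it finds the extreme points and writes a given point of the hull as a convex combination of at most N + 1 = 3 of them. The monotone chain hull construction, the fan triangulation, and all names are choices of this example, not taken from the text. The hull is assumed to have at least three vertices.

```python
def cross(o, a, b):
    """Twice the signed area of the triangle o, a, b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def extreme_points(points):
    """Vertices of the convex hull in counterclockwise order (monotone chain)."""
    pts = sorted(set(points))

    def half(seq):
        chain = []
        for p in seq:
            # Drop points that are not strict corners of the boundary.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return half(pts) + half(reversed(pts))

def convex_combination(x, points):
    """Three (extreme point, weight) pairs representing x, or None if x is outside."""
    hull = extreme_points(points)
    v0 = hull[0]
    # Fan triangulation: every point of the hull lies in some triangle v0, v1, v2.
    for v1, v2 in zip(hull[1:], hull[2:]):
        area = cross(v0, v1, v2)
        w0 = cross(x, v1, v2) / area   # barycentric coordinates of x
        w1 = cross(v0, x, v2) / area
        w2 = cross(v0, v1, x) / area
        if min(w0, w1, w2) >= -1e-12:
            return [(v0, w0), (v1, w1), (v2, w2)]
    return None

points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0), (0.5, 1.5)]
print(extreme_points(points))
print(convex_combination((1.5, 0.5), points))
```

Only the four corners of the square survive as extreme points. The interior point and the points on the edges are convex combinations of the corners, in line with part (ii) of theorem 3.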


13.3 THE STONE-WEIERSTRASS THEOREM 


Theorem 4. Let S be a compact Hausdorff space, C(S) the set of all real-valued
continuous functions on S. Let E be a subalgebra of C(S), that is,


(i) E is a linear subspace of C (S). 
(ii) The product of two functions in E belongs to E. 


In addition we impose the following conditions on E: 


(iii) E separates points of S, that is, given any pair of points p and q, p ≠ q, there
is a function f in E such that f(p) ≠ f(q).
(iv) All constant functions belong to E. 


Conclusion: E is dense in C (S) in the maximum norm. 


The classical Weierstrass theorem is a special case of this proposition, with S an 
interval of the x axis, and E the set of all polynomials in x. We present Louis de 
Branges’s elegant proof, based on the Krein-Milman theorem, of Stone’s generaliza- 
tion of the Weierstrass theorem. 
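The classical special case can be illustrated numerically. The sketch below uses Bernstein polynomials, a standard constructive route to the Weierstrass theorem on [0, 1]. This is a different technique from the de Branges proof presented in this section. The test function |x - 1/2|, the degrees, and the evaluation grid are assumptions of this example.

```python
import math

def bernstein(f, n, x):
    """Value at x in [0, 1] of the Bernstein polynomial of degree n of f."""
    return sum(f(k / n) * math.comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

def sup_error(f, n, grid=101):
    """Maximum of |f - B_n f| over an equispaced grid in [0, 1]."""
    xs = [i / (grid - 1) for i in range(grid)]
    return max(abs(f(x) - bernstein(f, n, x)) for x in xs)

f = lambda x: abs(x - 0.5)   # continuous, but not differentiable at 1/2
errors = {n: sup_error(f, n) for n in (10, 40, 160)}
print(errors)
```

The printed errors should decrease as the degree grows, which is the density of the polynomials in the maximum norm seen on one example. The decrease is slow for this non-smooth function.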




Proof. According to the spanning criterion, theorem 8 of chapter 8, E is dense in
C(S) if the only bounded linear functional ℓ on C(S) that is zero on E is the zero
functional. According to the Riesz-Kakutani representation theorem, theorem 14 of
chapter 8, the bounded linear functionals on C(S) are of the form

$$ \ell(f) = \int_S f\, d\nu, $$

ν a signed measure of finite total variation ‖ν‖ = ∫ |dν|. So what we have to show
is that if ∫_S f dν = 0 for all f in E, then ν = 0.

Suppose not; denote by U the set of signed measures of total mass ≤ 1
that annihilate all functions in E. This is a convex set, and according to Alaoglu's
theorem, theorem 3 in chapter 12, compact in the weak* topology. So according to
the Krein-Milman theorem, if U contained a nonzero measure, it would contain a
nonzero extreme point; call it μ. Since μ is extreme, ‖μ‖ = 1. Since E is an algebra,
if f and g belong to E, so does gf. Since μ annihilates every function in E,

$$ \int f g\, d\mu = 0. $$

It follows that the measure g dμ also annihilates every function in E.
Let g be a function in E whose values lie between 0 and 1:

$$ 0 < g(p) < 1 \quad \text{for all } p \text{ in } S. $$

Denote

$$ a = \|g\mu\| = \int g\, |d\mu|, \qquad b = \|(1-g)\mu\| = \int (1-g)\, |d\mu|. $$

Clearly a and b are positive. Add them:

$$ a + b = \int |d\mu| = 1. $$

The identity

$$ \mu = a\, \frac{g\mu}{a} + b\, \frac{(1-g)\mu}{b} $$

represents μ as a nontrivial convex combination of gμ/a and (1-g)μ/b, both points
in U. Since μ is an extreme point, μ must be equal to gμ/a.

Define the support of the measure μ to be the set of points p that have the property
that ∫_N |dμ| > 0 for any open set N containing p. If μ = gμ/a, it follows that g
has the same value at all points of the support of μ.

We claim that the support of μ consists of a single point. To see this, suppose that
both p and q, p ≠ q, belong to the support of μ. Since the functions in E separate
points of S, there is a function h in E, h(p) ≠ h(q). Adding a large enough constant
to h and dividing it by another large constant, we obtain a function g whose values
lie between 0 and 1, and g(p) ≠ g(q). This contradicts our previous conclusion.
A measure μ whose support consists of a single point p, and ‖μ‖ = 1, is a unit
point mass at p. Therefore

$$ \int f\, d\mu = f(p) \quad \text{or} \quad -f(p). $$

Since, by hypothesis, the constant 1 belongs to E, ∫ f dμ ≠ 0 for f ≡ 1 in E, a
contradiction. □


13.4 CHOQUET'S THEOREM


The following further extension of Carathéodory's theorem holds on locally convex
linear spaces:


Theorem 5. Let X be a LCT linear space, K a nonempty compact, convex subset
of X, K_e the set of extreme points of K. For any point u of K there is a probability
measure m_u on K̄_e, the closure of K_e, that is, a measure satisfying

$$ m_u \ge 0, \qquad \int_{\bar{K}_e} dm_u = 1, \tag{9} $$

such that in the weak sense

$$ u = \int_{\bar{K}_e} e\, dm_u. \tag{10} $$

The weak sense of the integral representation above is that for every continuous
linear functional ℓ over X,

$$ \ell(u) = \int_{\bar{K}_e} \ell(e)\, dm_u(e). \tag{10'} $$
e 
Proof. For any continuous ℓ and K compact, ℓ achieves its minimum and
maximum on K. According to corollary 8' of chapter 1, the sets where ℓ achieves its
minimum and maximum are extreme subsets of K. According to part (i') of
theorem 3, these extreme subsets contain extreme points. Therefore for any u in K and
for every continuous linear functional ℓ,

$$ \min_{p \in K_e} \ell(p) \le \ell(u) \le \max_{p \in K_e} \ell(p). \tag{11} $$

It follows from (11) applied to ℓ_1 - ℓ_2 that if ℓ_1 and ℓ_2 are equal on K̄_e, they are
equal on K; therefore ℓ on K̄_e uniquely determines the value of ℓ(u) for every u
in K. Denote the restriction of ℓ to K̄_e by f:

$$ f(q) = \ell(q), \qquad q \text{ in } \bar{K}_e. \tag{12} $$


Since ℓ(u) is determined by f, we can write

$$ \ell(u) = u(f), \tag{12'} $$

clearly a linear functional of f. We rewrite (11) as

$$ \min_{q \in \bar{K}_e} f(q) \le u(f) \le \max_{q \in \bar{K}_e} f(q). \tag{13} $$

The set L of functions f defined in (12) forms a linear subspace of the space C(K̄_e)
of continuous functions on K̄_e. We claim that we can extend the linear functional
u(f) defined in (12') from L to all of C(K̄_e) so that property (13) is preserved.
To see this, we adjoin the function f_0 ≡ 1 to L and define u(f_0) = 1. Then we
appeal to theorem 1 of chapter 4, according to which a positive linear functional can
be extended positively to the space of all functions. Since (13) implies that u(f) is
positive, such an extension is possible; imagine it done.

K̄_e is a closed subset of the compact set K and is therefore compact. We appeal
now to the Riesz-Kakutani representation theorem (see chapter 8, theorem 14)
according to which a bounded linear functional u on C(K̄_e) can be represented as

$$ u(f) = \int_{\bar{K}_e} f\, dm. \tag{14} $$

Since the functional is positive, so is the representing measure m; it follows from
u(f_0) = 1 that m(K̄_e) = 1. Setting (12) and (12') into (14), we obtain (10'). □


Theorem 5 asserts that every point u of the compact convex set K can be represented
as a continuous convex combination of points of the closure of the set of
extreme points. This proviso is needed because the set of extreme points may not be
closed, not even in a finite-dimensional space. Say we take in $\mathbb{R}^3$ the convex hull of
the circle: $x^2 + y^2 = 1$, $z = 0$, and the interval:

$$ x = 1, \quad y = 0; \qquad -1 \le z \le 1. $$

The extreme points of the convex hull are $x = 1$, $y = 0$, $z = \pm 1$, and all points of
the circle $x^2 + y^2 = 1$, $z = 0$ except $x = 1$, $y = 0$, $z = 0$.
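
The example can be checked numerically. The following is only an illustrative sketch; the helper names `circle_point` and `midpoint` are ours, not the text's. It confirms that the point (1, 0, 0) is the midpoint of the two endpoints of the interval, hence not extreme, while it is a limit of points of the circle, which are extreme.

```python
import math

def circle_point(theta):
    """Point on the circle x^2 + y^2 = 1, z = 0."""
    return (math.cos(theta), math.sin(theta), 0.0)

def midpoint(p, q):
    return tuple((a + b) / 2 for a, b in zip(p, q))

top, bottom = (1.0, 0.0, 1.0), (1.0, 0.0, -1.0)
v = (1.0, 0.0, 0.0)

# v is a proper convex combination of two other points of the hull,
# so it is not an extreme point.
is_midpoint = midpoint(top, bottom) == v

# Circle points with theta != 0 are extreme; they converge to v as theta -> 0.
distances = [math.dist(circle_point(10.0 ** -k), v) for k in range(1, 6)]
```

So v lies in the closure of the set of extreme points without being extreme itself.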


Exercise 7. Let v be a point in $\bar{K}_e$ that does not belong to $K_e$; show that v can then
be represented as

$$ v = \int e \, dm, $$

where m is a probability measure on K such that m(v) = 0.


Exercise 8. Deduce part (ii) of theorem 3 from theorem 5. 


Choquet gave the following sharpening of theorem 5: 


Theorem 6 (Choquet). Let K be a nonempty compact, convex subset of a LCT linear
space, and assume in addition that K is metrizable. Then every point u of K can
be represented in the weak sense as

$$ u = \int_{K_e} e \, dm_u, \tag{10''} $$

where $m_u$ is a probability measure on the set of extreme points.

Proof. For proof, see Phelps. □


We call (10”) a Choquet-type representation. In the next chapter we give many 
examples of Choquet-type representations of convex sets. 

We present now a useful result that extends to LCT spaces the following intuitively
clear property of convex hulls of compact sets S in finite-dimensional spaces:
the points of the convex hull that are added to S to make it convex contain no extreme points.


Theorem 7. Let X be a LCT linear space, S a compact subset of X. Suppose K, the 
closure of the convex hull of S, is compact. Then every extreme point of K belongs 
to S. 


Proof. Let N be any open convex set that contains the origin. The open sets $y + N$,
y in S, form an open cover of S; since S is compact, a finite number of them cover S:

$$ \bigcup_i (y_i + N) \supset S. \tag{15} $$

Denote by $S_i$ the intersection $(y_i + N) \cap S$; it follows from (15) that

$$ \bigcup_i S_i = S. \tag{16} $$

Denote the closure of the convex hull of $S_i$ by $K_i$. Since $S_i \subset S$, it follows that
$K_i \subset K$; since $K_i$ is closed and K assumed compact, it follows that each $K_i$ is
compact. Next we need


Lemma 8. Let $K_1$ and $K_2$ be a pair of compact convex sets in a LCT linear space.
Then the convex hull of their union is compact.

Proof. Since $K_1$ and $K_2$ themselves are convex, it is easy to see that the convex
hull of their union consists of all points of the form

$$ a y_1 + (1 - a) y_2, \qquad y_1 \in K_1, \quad y_2 \in K_2, \quad 0 \le a \le 1. \tag{17} $$

These points are images of the triple product

$$ K_1 \times K_2 \times I, \qquad I = [0, 1] \tag{18} $$

under the mapping (17). The triple product (18) is compact, and according to the
definition of a LCT space, the mapping (17) is continuous. It follows that the image
of the compact set (18) is compact, as claimed in lemma 8. □
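
The step "it is easy to see" in the proof of lemma 8 amounts to grouping the terms of a convex combination by the set they come from. The sketch below illustrates this on hypothetical data (two segments in the plane; the function `combine` and all numbers are ours): a convex combination of four points of $K_1 \cup K_2$ is rewritten in the form (17).

```python
def combine(weights, points):
    """Convex combination sum_i w_i * p_i of points in the plane."""
    return tuple(sum(w * p[i] for w, p in zip(weights, points)) for i in range(2))

# Hypothetical data: p1, p2 lie in the segment K1, and q1, q2 in the segment K2.
p1, p2 = (0.0, 0.0), (1.0, 0.0)      # K1 = segment from (0,0) to (1,0)
q1, q2 = (0.0, 2.0), (0.0, 3.0)      # K2 = segment from (0,2) to (0,3)
w = (0.1, 0.3, 0.2, 0.4)             # weights of p1, p2, q1, q2; they sum to 1

z = combine(w, (p1, p2, q1, q2))

# Group the terms by set: a is the total weight carried by K1.
a = w[0] + w[1]
y1 = combine((w[0] / a, w[1] / a), (p1, p2))              # in K1, since K1 is convex
y2 = combine((w[2] / (1 - a), w[3] / (1 - a)), (q1, q2))  # in K2, since K2 is convex
z_from_17 = combine((a, 1 - a), (y1, y2))
```

Here `z` and `z_from_17` agree up to rounding, with `y1` in $K_1$ and `y2` in $K_2$.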


From lemma 8 we deduce inductively that the convex hull of the union of a finite
number of compact convex sets is compact. We turn to the compact sets $K_i$ defined above
and claim that the convex hull of their union, denoted as $\mathrm{CH}[K_1 \cup \dots \cup K_n]$, contains K:

$$ K \subset \mathrm{CH}[K_1 \cup \dots \cup K_n]. \tag{19} $$

We note that $K_i$ contains $S_i$. Therefore, by (16), $K_1 \cup \dots \cup K_n$ contains
$S_1 \cup \dots \cup S_n = S$. By lemma 8, the right side of (19) is compact, and therefore closed. Thus it
is a closed, convex set that contains S. But K is defined to be the smallest such set,
and it therefore is contained in $\mathrm{CH}[K_1 \cup \dots \cup K_n]$. This proves (19). In words, every
point of K is a convex combination of points of the $K_i$. Since each $K_i$ is contained in
K, it follows from the definition of extreme point that each extreme point p of K
belongs to a $K_i$.

By definition, $S_i$ is contained in $y_i + N$. Since N is convex, the convex hull $\hat{S}_i$ of $S_i$
belongs to $y_i + N$: $\hat{S}_i \subset y_i + N$. For any set R, $R + N$ contains the closure of R.
Since $K_i$ is the closure of $\hat{S}_i$, it follows that $K_i \subset y_i + N + N = y_i + 2N$. Therefore,
since each $y_i$ belongs to S,

$$ \bigcup_i K_i \subset S + 2N. $$

We have shown that each extreme point of K belongs to some $K_i$, so it follows from
the result above that every extreme point p of K belongs to $S + 2N$. Now N is
arbitrary; since S is closed, the intersection of all sets $S + 2N$ is S itself. Thus every
extreme point p of K is contained in S.

Theorem 7 is useful in identifying extreme points. 


Exercise 9. Show that if S is a compact set in a Banach space, its closed convex hull 
is compact. Is this true in every LCT space? 


NOTE. The prime examples of LCT linear spaces are Banach spaces in the weak and
weak* topologies. Other important examples are spaces of distributions. In view of
the enormous success of the theory of distributions in the theory of partial differential
equations and in harmonic analysis, it was thought that other locally convex
spaces would play a similarly fruitful role; that hope has not yet been realized.
Other applications of the Krein-Milman theorem and its generalizations are described
in Diestel and Uhl.


BIBLIOGRAPHY 


Choquet, G. Existence des représentations intégrales au moyen des points extrémaux dans les cônes
convexes. C. R. Acad. Sci. Paris, 243 (1956): 699-702.
de Branges, L. The Stone-Weierstrass theorem. Proc. AMS, 10 (1959): 822-824.
Diestel, J. and Uhl, J. J. Vector Measures. American Mathematical Society, Providence, RI, 1970.
Kelley, J. L. General Topology. Van Nostrand, Princeton, NJ, 1955.
Kelley, J. L. and Namioka, I. Linear Topological Spaces. Van Nostrand, Princeton, NJ, 1963.
Krein, M. G. and Milman, D. On extreme points of regularly convex sets. Studia Math., 9 (1940): 133-
Phelps, R. R. Lectures on Choquet's Theorem. Van Nostrand, Princeton, NJ, 1966.


14 


EXAMPLES OF CONVEX
SETS AND THEIR 
EXTREME POINTS 


In this chapter we present a great variety of examples of convex sets, their extreme 
points, and Choquet-type integral representations of points of the set in terms of the 
extreme points. In some examples the extreme points are determined by a direct argu- 
ment. Then a locally convex topology is introduced so that the convex set in question 
is compact and the Choquet-type representation is then derived via Choquet’s the- 
orem. In other examples the Choquet-type representation is derived directly by an 
analytic argument. The representation is then used to identify the extreme points of 
the set. In most of these examples the representation is unique. 


14.1 POSITIVE FUNCTIONALS 


Let Q denote a compact Hausdorff space, and C(Q) the space of continuous functions
on Q whose values are real. We denote a linear functional $\ell$ defined on C(Q)
as positive if $\ell(f) \ge 0$ for all nonnegative f in C(Q). Recall from chapter 8 that a
positive linear functional is bounded. Denote by P the collection of all positive linear
functionals $\ell$ that satisfy

$$ \ell(1) = 1. \tag{1} $$


Theorem 1. P is a convex set whose extreme points are the point evaluations $e_r$,
defined as

$$ e_r(f) = f(r), \tag{2} $$

r any point of Q.




Proof. The convexity of P is obvious. To see that every $e_r$, defined by (2), is an
extreme point, imagine $e_r$ represented as

$$ e_r = a m + (1 - a) \ell, \qquad m \text{ and } \ell \text{ in } P, \quad 0 < a < 1. \tag{3} $$

Let f be any function in C(Q) that is nonnegative and satisfies

$$ f(r) = 0. \tag{4} $$

Set f into (3); using (2) and (4) obtains

$$ e_r(f) = f(r) = 0 = a m(f) + (1 - a) \ell(f). $$

We claim that m(f) and $\ell(f)$ are both zero; otherwise, one of them would have
to be positive, the other negative, in contradiction to $f \ge 0$ and $\ell$, m both positive
functionals.
Every continuous f can be decomposed into its positive and negative part:

$$ f = f_+ - f_-, \qquad f_+ = \max(f, 0). $$

Both $f_+$ and $f_-$ are nonnegative, and if f(r) = 0, $f_+(r) = f_-(r) = 0$. It follows
from this and the foregoing that if f(r) = 0, $m(f) = \ell(f) = 0$. In other words,
the nullspaces of m and of $\ell$ contain the nullspace of $e_r$. Since the nullspace of a
nontrivial functional has codimension 1, it follows that $\ell$ and m are constant multiples
of $e_r$; since all functionals in P satisfy (1), it follows that the constant multiplier is 1.
This proves that $\ell = m = e_r$, so $e_r$ is extreme.

We show now that the $e_r$ are the only extreme points of P. Let $\ell$ be any positive
linear functional on C(Q), normalized by (1). According to the Riesz-Kakutani representation
theorem, there exists a nonnegative measure m on Q such that for every
continuous function f on Q,

$$ \ell(f) = \int_Q f \, dm. \tag{5} $$

Because of the normalization (1), m(Q) = 1; the measure m is uniquely determined
by the functional $\ell$. We claim that the only extreme points of the set of positive linear
functionals normalized by (1) are those where the measure m is concentrated at a
single point. Otherwise, m can be split as $a m_1 + (1 - a) m_2$, where both $m_1$ and $m_2$
are nonnegative measures of total mass 1, $m_1 \ne m_2$, and $0 < a < 1$. Setting

$$ \ell_j(f) = \int_Q f \, dm_j, \qquad j = 1, 2, $$

we get $\ell = a \ell_1 + (1 - a) \ell_2$. If $\ell$ were an extreme point, $\ell_1 = \ell_2 = \ell$, and so $\ell$
can be represented in form (5) by the distinct measures $m_1$ and $m_2$. This contradicts
uniqueness of the representing measure. This completes the proof of theorem 1. □
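
The splitting argument is easy to see on a finite space. The sketch below assumes, purely for illustration, that Q consists of three points, so that a normalized positive functional is a probability vector; the names `functional`, `m`, `m1`, `m2` are ours. A measure charging two points is written as a proper convex combination of two distinct point evaluations.

```python
# Q = {0, 1, 2}; a positive normalized functional is a probability vector m,
# acting on a function f (a list of values) by l(f) = sum_r f(r) m(r).
def functional(m):
    return lambda f: sum(fr * mr for fr, mr in zip(f, m))

m = [0.5, 0.5, 0.0]          # not concentrated at a single point
m1 = [1.0, 0.0, 0.0]         # point evaluation at r = 0
m2 = [0.0, 1.0, 0.0]         # point evaluation at r = 1
a = 0.5

l, l1, l2 = functional(m), functional(m1), functional(m2)
f = [3.0, -1.0, 7.0]         # an arbitrary test function on Q

lhs = l(f)
rhs = a * l1(f) + (1 - a) * l2(f)
```

Since `l1` and `l2` differ on `f`, the functional `l` is not an extreme point of P.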




We can use formula (2) to rewrite formula (5) symbolically as

$$ \ell = \int_Q e_r \, dm(r), \tag{6} $$

a Choquet-type representation.


14.2 CONVEX FUNCTIONS 


In this section we make use of the notions and results of the theory of distributions 
as explained in Appendix B. 


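Definition. A real-valued function f in $\mathbb{R}^n$ is convex if it satisfies

$$ f\Big( \sum_{j=1}^{N} a_j x_j \Big) \le \sum_{j=1}^{N} a_j f(x_j) \tag{7} $$

for all choices of $x_1, \dots, x_N$ in $\mathbb{R}^n$ and all $a_j$ satisfying $a_j \ge 0$, $\sum a_j = 1$.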


Here we consider convex functions f of a single variable. It suffices to assume
(7) to hold for N = 2:

$$ f(ax + (1 - a)z) \le a f(x) + (1 - a) f(z), \qquad 0 \le a \le 1, \tag{8} $$

for all x, z. Setting $ax + (1 - a)z = y$, condition (8) is easily seen to be equivalent
with the following: for x < y < z,

$$ \frac{f(y) - f(x)}{y - x} \le \frac{f(z) - f(y)}{z - y}. \tag{9} $$

It follows from (9) that every convex function is continuous and has right and left
derivatives.
The second difference quotients

$$ \frac{f(x + h) - 2 f(x) + f(x - h)}{h^2} \tag{10} $$

converge in the sense of distributions to $f''$ as h tends to zero. It follows from (9)
that the difference quotients (10) are nonnegative; since the limit in the sense of
distributions of a nonnegative distribution is nonnegative, it follows that for convex f,

$$ 0 \le f'' \tag{11} $$

in the sense of distributions.
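
A numerical sketch may make the statement concrete. The choice of sample functions, grid, and step sizes below is ours and only illustrative. The first part checks that the second difference quotients of a few convex functions are nonnegative; the second part takes f(x) = |x|, for which the quotient is a triangle of height 2/h supported on [-h, h], and checks that its total mass stays equal to 2 as h shrinks, in line with f'' being twice the delta function at the origin.

```python
import math

def second_difference(f, x, h):
    """Second difference quotient (10) of f at x with step h."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

convex_functions = [abs, math.exp, lambda x: x * x]
grid = [k / 10 for k in range(-20, 21)]

# For convex f the quotients are nonnegative (up to rounding).
smallest = min(second_difference(f, x, h)
               for f in convex_functions for x in grid for h in (0.5, 0.1, 0.01))

def mass(h, n=4000):
    """Midpoint rule for the integral of the quotient of |x| over [-h, h]."""
    dx = 2 * h / n
    return sum(second_difference(abs, -h + (i + 0.5) * dx, h) * dx for i in range(n))

masses = [mass(h) for h in (0.5, 0.1, 0.01)]
```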


Convexity in an interval is defined in the same way; note that a function convex
in an interval need not be continuous at the endpoints.




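We denote by C the set of functions convex in the interval [0, 1] and satisfying

$$ f(0) = 0, \qquad f(1) = 1, \qquad 0 \le f(x). \tag{12} $$

Theorem 2. C is a convex set, whose extreme points are the functions

$$ e_r(x) = \begin{cases} 0 & \text{for } x \le r, \\ \dfrac{x - r}{1 - r} & \text{for } r \le x \le 1, \end{cases} \tag{13} $$

where $0 \le r < 1$, and

$$ e_1(x) = \begin{cases} 0 & \text{for } x < 1, \\ 1 & \text{for } x = 1. \end{cases} \tag{13'} $$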

Proof. First we show that all the functions $e_r$ defined above are extreme points
of C. Suppose that $e_r$ is represented as

$$ e_r = a f + (1 - a) g. \tag{14} $$

We claim that both f and g are zero on [0, r]; for $e_r$ is, and so otherwise by (14) one
of the functions f or g would be negative. Since this is contrary to (12),

$$ f(x) = g(x) = 0, \qquad 0 \le x \le r, $$

follows. Similarly we claim that f(x) and g(x) are both equal to $e_r$ on [r, 1]. For if
not, one of them would be $> e_r$ at some point y > r; a short calculation shows that
this contradicts (9) with x = r, z = 1. This shows that $e_r$ is extreme.

Let f be any convex function in [0, 1] satisfying (12). We set f(r) = 0 for r < 0.
Clearly, f thus extended remains convex. Let $\phi$ be any $C_0^\infty$ test function that is zero
for r > 0. Then according to the theory of distributions

$$ \int f \phi'' \, dr = \int \phi f'' \, dr, \tag{15} $$

where ' denotes differentiation with respect to r. We choose now x in 0 < x < 1,
and define the function $\phi_x(r)$ by

$$ \phi_x(r) = \begin{cases} x - r & \text{for } r \le x, \\ 0 & \text{for } r > x. \end{cases} \tag{16} $$

The function $\phi_x$ is piecewise linear and $\phi_x'' = \delta(r - x)$. If we could substitute $\phi = \phi_x$
in (15), we would obtain

$$ f(x) = \int \phi_x(r) f''(r) \, dr. \tag{17} $$

Since $\phi_x$ is not $C^\infty$, this is not legitimate, so we approximate $\phi_x$ by a sequence $\phi_x^{\varepsilon}$ of
$C^\infty$ functions. Since f(r) is continuous for r < 1, the left side of (15) tends to the
left side of (17). On the other hand, the nonnegative distribution $f''$ is a nonnegative
measure; therefore the right side of (15) tends to the right side of (17). This proves
(17) for x < 1. We can rewrite it using notation (13) as follows:

$$ f(x) = \int e_r(x) (1 - r) f''(r) \, dr, \qquad x < 1. \tag{18} $$

We let $x \to 1$ and obtain

$$ m_1 = \lim_{x \to 1} f(x) = \int (1 - r) f''(r) \, dr. \tag{18'} $$

Since f is an increasing function, and f(1) = 1, $m_1 \le 1$. We can combine formulas
(18) and (18') into one and write

$$ f(x) = \int_0^1 e_r(x) \, dm(r), \tag{19} $$

where

$$ m(r) = \int_0^r (1 - s) f''(s) \, ds \quad \text{for } r < 1, \quad \text{and} \quad m(1) = 1 - m_1. \tag{19'} $$
0 
The measure m is uniquely determined by the convex function f; this follows readily 
from formula (19). It follows, as in section 14.1, that the only extreme points of the 
set C of convex functions satisfying conditions (12) are those where the measure m 
is concentrated at a single point of the interval [0, 1]. This completes the proof of
theorem 2. □


We can rewrite (19) symbolically as

$$ f = \int_0^1 e_r \, dm(r), \tag{20} $$

a Choquet-type representation of convex functions.
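
As a worked instance of this representation, take $f(x) = x^2$, our choice of example. Assuming the extreme functions have the form $e_r(x) = (x - r)/(1 - r)$ for $x > r$ and zero otherwise, the measure has density $(1 - r) f''(r) = 2(1 - r)$ on [0, 1); its total mass is 1, so no mass sits at r = 1. The sketch below checks the representation by a midpoint rule; the function names are ours.

```python
def e(r, x):
    """Extreme function e_r: zero up to r, then linear, with e_r(1) = 1."""
    return 0.0 if x <= r else (x - r) / (1 - r)

def density(r):
    """Density of dm for f(x) = x^2: (1 - r) f''(r) = 2 (1 - r)."""
    return 2 * (1 - r)

def representation(x, n=20000):
    """Midpoint-rule approximation of the integral of e_r(x) dm(r) over 0 <= r < 1."""
    dr = 1.0 / n
    return sum(e((i + 0.5) * dr, x) * density((i + 0.5) * dr) * dr for i in range(n))

xs = [0.0, 0.25, 0.5, 0.9]
errors = [abs(representation(x) - x * x) for x in xs]
total_mass = sum(density((i + 0.5) / 20000) / 20000 for i in range(20000))
```

The integrand reduces to $2(x - r)$ for $r < x$, whose integral over [0, x] is $x^2$, which is what the quadrature reproduces.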

Exercise 1. Find a version of theorem 2 for convex functions of n variables.


Exercise 2. Prove theorem 2 without the theory of distributions, using the theorem 
of Krein-Milman. 


14.3 COMPLETELY MONOTONE FUNCTIONS 


The difference operator $D_a$, acting on functions of a single variable, is defined by

$$ (D_a f)(t) = f(t + a) - f(t). \tag{21} $$

For a > 0, $D_a$ maps functions f defined on $\mathbb{R}_+$ into functions defined on $\mathbb{R}_+$.




Definition. A real-valued function defined on $\mathbb{R}_+$ is called completely monotone
(c.m.) if

$$ (-1)^n \Big( \prod_{j=1}^{n} D_{a_j} \Big) f \ge 0 \quad \text{on } \mathbb{R}_+ \tag{22} $$

for all $a_j > 0$ and all n = 0, 1, ....

The following result is due to S. Bernstein:


Theorem 3. Every completely monotone function f on $\mathbb{R}_+$ can be represented as

$$ f(t) = \int_0^\infty e^{-\lambda t} \, dm(\lambda), \tag{23} $$

m some nonnegative measure, $m(\mathbb{R}_+) < \infty$. Conversely, every function of form (23)
is completely monotone.


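Proof. To show that f of form (23) is completely monotone we write

$$ D_a f = \int_0^\infty D_a e^{-\lambda t} \, dm(\lambda) = \int_0^\infty (e^{-a\lambda} - 1) e^{-\lambda t} \, dm(\lambda); $$

then

$$ (-1)^n \Big( \prod_{j=1}^{n} D_{a_j} \Big) f = \int_0^\infty \prod_{j=1}^{n} (1 - e^{-a_j \lambda}) \, e^{-\lambda t} \, dm $$

is clearly nonnegative.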
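
The computation can be checked numerically for a discrete measure. In this sketch the measure (masses 0.3 and 0.7 at $\lambda = 0.5$ and $\lambda = 2$), the step sizes, and the sample points are hypothetical choices of ours. It applies up to three difference operators and confirms that the signed differences are nonnegative.

```python
import math
from itertools import product

def D(a, f):
    """Difference operator: (D_a f)(t) = f(t + a) - f(t)."""
    return lambda t: f(t + a) - f(t)

def signed_difference(f, steps):
    """(-1)^n D_{a_1} ... D_{a_n} f, with n = len(steps)."""
    g = f
    for a in steps:
        g = D(a, g)
    sign = (-1) ** len(steps)
    return lambda t: sign * g(t)

# A function of form (23) with a discrete measure.
f = lambda t: 0.3 * math.exp(-0.5 * t) + 0.7 * math.exp(-2.0 * t)

values = [signed_difference(f, steps)(t)
          for n in range(4)
          for steps in product((0.1, 1.0), repeat=n)
          for t in (0.0, 0.5, 3.0)]
```

Each entry of `values` is a sum of positive terms of the form weight times $\prod (1 - e^{-a_j \lambda}) e^{-\lambda t}$, as in the displayed formula.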
Turning to the direct part, we will make use of the following properties of c.m. 
functions. 


Lemma 4.
(i) The sum of two c.m. functions is c.m.
Suppose that f is a c.m. function; then

(ii) f is nonnegative.
(iii) af is c.m. for a > 0.
(iv) $-D_a f$ is c.m. for a > 0.
(v) $T_a f = f(t + a)$ is c.m. for a > 0.
(vi) $H_b f = f(bt)$ is c.m. for b > 0.
(vii) f is nonincreasing.
(viii) f is convex.




Proof. Parts (i) and (iii) follow from the fact that the operators $D_a$ appearing in
condition (22) characterizing c.m. functions are linear. Part (ii) is (22) for n = 0; parts (iv) and
(v) can be deduced by applying the operators $D_a$, respectively $T_a$, to (22), and noting
that these operators commute with $D_{a_j}$. Part (vi) follows by applying $H_b$ to (22), and
noting that

$$ H_b D_a = D_{a b^{-1}} H_b. $$

Part (vii) follows from (22) for n = 1, and (viii) from (22) for n = 2. □


We define X to be the space of all real-valued functions on $\mathbb{R}_+$, and take K to be
the subset of all c.m. functions, in the sense of (22), normalized by

$$ f(0) = 1. \tag{24} $$

It follows from parts (i) and (iii) of lemma 4 that K is a convex set.


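Lemma 5. The extreme points of K are of the form

$$ e_\lambda(t) = e^{-\lambda t}, \qquad 0 \le \lambda < \infty, \tag{25a} $$

and

$$ e_\infty(t) = \begin{cases} 0 & \text{for } t > 0, \\ 1 & \text{for } t = 0. \end{cases} \tag{25b} $$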


Proof. By parts (ii) and (vii) of lemma 4, every c.m. function is nonnegative and
nonincreasing. It follows from (24) that every f in K satisfies

$$ 0 \le f(t) \le 1. \tag{26} $$

Let e be an extreme point of K; then in particular

$$ 0 \le e(t) \le 1. \tag{26'} $$

Suppose that for all a > 0, strict inequality holds:

$$ 0 < e(a) < 1. \tag{26''} $$

We define two auxiliary functions as follows:

$$ f(t) = \frac{e(t) - e(t + a)}{1 - e(a)}, \qquad g(t) = \frac{e(t + a)}{e(a)}. \tag{27} $$

It follows from parts (iii), (iv), and (v) of lemma 4 and (26'') that f and g belong to K.


Clearly,

$$ e = (1 - e(a)) f + e(a) g. \tag{27'} $$

By definition of extreme point given in the previous chapter, it follows from (27')
that f = g = e. In particular, by (27), this implies that for all t and a,

$$ e(a) e(t) = e(t + a). \tag{28} $$

All continuous solutions of this equation are exponential functions. It follows from
part (viii) of lemma 4 that every f in K is convex, and so is continuous for t > 0.
We can conclude that

$$ e(t) = e^{-\lambda t}. $$

That $\lambda$ is $\ge 0$ follows from part (vii) of lemma 4.

The cases where (26'') fails to hold for some a > 0 can be easily handled. When
e(a) = 1 for some a > 0, $e = e_0$; when e(a) = 0 for some a > 0, $e = e_\infty$. □
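
The role of the two auxiliary functions can be illustrated numerically. In the sketch below (function names, the shift a = 0.7, and the sample points are our choices), f and g are built as $f(t) = (e(t) - e(t + a))/(1 - e(a))$ and $g(t) = e(t + a)/e(a)$. For a pure exponential they coincide with e; for a proper mixture of two exponentials they differ, which exhibits the mixture as a nontrivial convex combination.

```python
import math

def split(e, a):
    """Auxiliary functions f, g built from e and a shift a > 0."""
    f = lambda t: (e(t) - e(t + a)) / (1 - e(a))
    g = lambda t: e(t + a) / e(a)
    return f, g

pure = lambda t: math.exp(-2.0 * t)                                # an exponential
mixed = lambda t: 0.5 * math.exp(-t) + 0.5 * math.exp(-3.0 * t)    # a proper mixture

a, ts = 0.7, [0.0, 0.3, 1.0, 2.5]

def gap(e):
    """Largest difference between f and g on the sample points."""
    f, g = split(e, a)
    return max(abs(f(t) - g(t)) for t in ts)

def recombination_error(e):
    """Checks the identity e = (1 - e(a)) f + e(a) g on the sample points."""
    f, g = split(e, a)
    return max(abs((1 - e(a)) * f(t) + e(a) * g(t) - e(t)) for t in ts)
```

Here `gap(pure)` vanishes up to rounding while `gap(mixed)` does not, and the recombination identity holds in both cases.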


We introduce now the topology for functions that is the coarsest in which all the
linear functionals $\ell_t$:

$$ \ell_t(f) = f(t), \qquad 0 \le t, \tag{29} $$

are continuous. The topology is the product topology

$$ \prod_{0 \le t} \mathbb{R}. $$

According to (26), the values of f in K lie between 0 and 1. So K is a subset of

$$ \prod_{0 \le t} [0, 1], $$

which by Tychonov's theorem is compact. So to show K compact, it suffices to show
that K is closed. But this is easy: for fixed $a_j$ and t, the set of f that satisfy (22) is
clearly closed. K, being the intersection of these sets for all $a_j$ and all $t \ge 0$, is
closed.
We showed in lemma 5 that the extreme points of K are contained in the set $\{e_\lambda\}$,
$0 \le \lambda \le \infty$, defined by (25a), (25b). The set $\{e_\lambda\}$, it is easy enough to show, is closed
and therefore contains the closure of the set of extreme points.
We appeal now to formula (10) in theorem 5 of chapter 13. That formula with e
given by (25) and $\ell$ by (29), is precisely the desired representation formula (23). □


Some corollaries and addenda: 




Theorem 6.

(i) Every completely monotone function is $C^\infty$.
(ii) The representation (23) is unique.
(iii) Every $e_\lambda$ defined by (25a), $0 \le \lambda < \infty$, is an extreme point.


Proof. Since the measure m is $\ge 0$ and $m(\mathbb{R}_+) < \infty$, we can differentiate (23)
with respect to t under the integral sign; this proves (i).

For (ii) suppose that some f in K had two distinct representations. Subtracting
them, we get

$$ \int_0^\infty e^{-\lambda t} \, d\nu(\lambda) = 0, $$

$\nu$ some signed measure of finite total mass over $\mathbb{R}_+$. The function

$$ F(\zeta) = \int_0^\infty e^{-\lambda \zeta} \, d\nu(\lambda) $$

is then an analytic function continuous in the right half-plane $\operatorname{Re} \zeta \ge 0$ that vanishes
on the real axis $\zeta = t$. It follows that $F(\zeta) \equiv 0$, in particular, that

$$ F(it) = 0 \quad \text{for all real } t. $$

F(it) is the Fourier transform of the measure $d\nu$. By uniqueness of the Fourier
transform, we conclude that $d\nu = 0$, meaning no f can have two different representations
of form (23).

For (iii), since K is compact, by Krein-Milman, it has at least one extreme point.
By lemma 5, the extreme points must be of the form $e_\lambda$ given by (25). They must
include more than $e_0$ and $e_\infty$ since the convex combinations of $e_0$ and $e_\infty$ do not
include all of K. So some $e_\lambda$, $\lambda \ne 0, \infty$, is an extreme point of K. According to
lemma 4 (vi), the operator $H_b$ maps K into K; being a one-to-one linear map, $H_b$
carries every extreme point of K into an extreme point of K. So, if $e_\lambda$ is extreme, so
is

$$ H_b e_\lambda = e_{\lambda b}, \qquad b > 0. $$

This completes part (iii) of theorem 6. □


We remark that an analogue of Bernstein's theorem holds in n-dimensional space,
namely for functions defined on $\mathbb{R}_+^n$. It also holds for functions on $\mathbb{Z}_+^n$.


14.4 THEOREMS OF CARATHEODORY AND BOCHNER 


Definition. A skew-symmetric doubly infinite sequence $\{a_n\}$ of complex numbers:

$$ a_{-n} = \bar{a}_n, \tag{30} $$

is called positive definite if

$$ \sum_{n,k} a_{n-k} \phi_n \bar{\phi}_k \ge 0 \tag{31} $$

for all finite sets of complex numbers $\phi_n$, $-N \le n \le N$.


The following result is due to Toeplitz, Carathéodory, and Herglotz:

Theorem 7. All positive definite sequences can be represented uniquely as

$$ a_n = \int_{S^1} e^{in\theta} \, dm(\theta), \tag{32} $$

where m is a nonnegative measure on $S^1$. Conversely, every sequence of form (32) is
positive definite.


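Proof. First we show that every sequence of form (32) is positive definite. Substitute
(32) for $a_{n-k}$ into the left side of (31):

$$ \sum_{n,k} \int e^{i(n-k)\theta} \, dm \; \phi_n \bar{\phi}_k = \int \sum_{n,k} e^{in\theta} e^{-ik\theta} \phi_n \bar{\phi}_k \, dm $$

$$ = \int \Big( \sum_n \phi_n e^{in\theta} \Big) \Big( \sum_k \bar{\phi}_k e^{-ik\theta} \Big) dm = \int \Big| \sum_n \phi_n e^{in\theta} \Big|^2 dm $$

is clearly nonnegative.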
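
This identity can be verified directly for a discrete measure on the circle. The sketch below uses a hypothetical three-point measure and random coefficients of our choosing. It compares the quadratic form $\sum a_{n-k} \phi_n \bar{\phi}_k$ with the integral of $|\sum \phi_n e^{in\theta}|^2$.

```python
import cmath
import random

# Hypothetical discrete measure on the circle: mass w_j at angle theta_j.
thetas = [0.3, 1.7, 4.0]
weights = [0.2, 0.5, 0.3]

def a(n):
    """a_n = integral of e^{i n theta} dm for the discrete measure above."""
    return sum(w * cmath.exp(1j * n * th) for w, th in zip(weights, thetas))

random.seed(0)
N = 4
phi = {n: complex(random.uniform(-1, 1), random.uniform(-1, 1)) for n in range(-N, N + 1)}

# The quadratic form sum_{n,k} a_{n-k} phi_n conj(phi_k).
quadratic_form = sum(a(n - k) * phi[n] * phi[k].conjugate() for n in phi for k in phi)

# Integral of |sum_n phi_n e^{i n theta}|^2 against the same measure.
integral = sum(w * abs(sum(phi[n] * cmath.exp(1j * n * th) for n in phi)) ** 2
               for w, th in zip(weights, thetas))
```

The two quantities agree up to rounding, and the sequence is skew-symmetric, $a_{-n} = \bar{a}_n$, because the measure is real.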
We turn now to the direct part of the theorem. We claim that if $\{a_n\}$ is a positive
definite sequence, then

$$ |a_m| \le a_0 \quad \text{for all integers } m. \tag{33} $$

To see this, set $\phi_0 = 1$, $\phi_m = \phi$, and all other $\phi_n = 0$. Substituting this into (31), we
get, using (30), that

$$ a_0 + a_m \phi + \bar{a}_m \bar{\phi} + a_0 |\phi|^2 \ge 0 $$

for all complex $\phi$; this implies (33).

According to the theory of distributions it follows from (33) that there exists a
distribution $\alpha$ whose Fourier coefficients are $a_n$:

$$ a_n = \int_{S^1} e^{in\theta} \alpha \, d\theta. \tag{34} $$
si 




For any $C^\infty$ function $\psi$,

$$ \int \psi \alpha \, d\theta = \sum_n \psi_n a_n, \tag{34'} $$

where $\psi_n$ are the Fourier coefficients of $\psi$. It follows from (33) that the right side
converges. We claim that $\alpha$ is nonnegative; to see this, take any trigonometric polynomial
$q_N$ of degree N,

$$ q_N(\theta) = \sum_{-N}^{N} \phi_n e^{in\theta}. $$

Then $|q_N(\theta)|^2 = \sum_{n,k} \phi_n \bar{\phi}_k e^{i(n-k)\theta}$. Set $\psi = |q_N|^2$ in (34'). We get an expression
on the right that, by (31), is nonnegative:

$$ \int |q_N(\theta)|^2 \alpha \, d\theta = \sum_{n,k} a_{n-k} \phi_n \bar{\phi}_k \ge 0. \tag{35} $$

Let $q(\theta)$ be any $C^\infty$ function on $S^1$. It is easy and classical to show that q can be
approximated by a sequence $\{q_N\}$ of trigonometric polynomials in the $C^\infty$ topology.
By definition of distribution, as N tends to $\infty$, (35) tends to

$$ \int |q(\theta)|^2 \alpha \, d\theta \ge 0 \tag{36} $$

for all $C^\infty$ functions q. Let $p(\theta)$ be any $C^\infty$ function that is positive on $S^1$. Then

$$ q(\theta) = \sqrt{p(\theta)} $$

is a $C^\infty$ function; therefore (36) implies that

$$ \int p \, \alpha \, d\theta \ge 0 \tag{37} $$

for any positive $C^\infty$ function p. A distribution $\alpha$ with this property is called nonnegative.
It is a classical result of the theory of distributions, see Appendix B, that every
nonnegative distribution is a nonnegative measure. Thus $\alpha \, d\theta = dm$ and formula (34)
is the desired formula (32). □


Theorem 7 can be extended to functions a defined on $\mathbb{Z}^k$, k any positive integer;
the proof is the same.


NOTE. Carathéodory’s own proof made use of his theorem on convex sets in finite- 
dimensional spaces. In section 14.6 we will give yet another proof, using the theory 
of positive harmonic functions. 




Exercise 3. Denote by P the set of all positive-definite sequences normalized by

$$ a_0 = 1. $$

(a) Show that P is a convex subset of the space $\ell^\infty$ of all bounded sequences.
Show that P is a compact subset of $\ell^\infty$ in the product topology.

(b) Show that the extreme points of P are of the form

$$ a_n = e^{in\theta}, $$

and deduce the representation (32) using theorem 4 of chapter 13.


An extension of Carathéodory's theorem is due to Bochner:


Definition. A skew-symmetric complex-valued continuous function a(s) on $\mathbb{R}$:

$$ a(-s) = \overline{a(s)}, \tag{40} $$

is called positive-definite if

$$ \sum_{j,k} a(s_j - s_k) \phi_j \bar{\phi}_k \ge 0 \tag{41} $$

for all choices of $s_1, \dots, s_N$ on $\mathbb{R}$, and for all complex numbers $\phi_1, \dots, \phi_N$.


Exercise 4. Show that condition (41) is equivalent to the requirement that

$$ \iint a(s - t) \phi(s) \bar{\phi}(t) \, ds \, dt \ge 0 \tag{41'} $$

for all continuous, complex-valued functions $\phi$ with compact support.


Theorem 8. Every continuous positive-definite function a can be represented
uniquely as

$$ a(s) = \int e^{is\sigma} \, dm(\sigma), \tag{42} $$

m a nonnegative measure on $\mathbb{R}$, $m(\mathbb{R}) < \infty$. Conversely, every function of form (42)
is positive-definite.


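Proof. We show first that every function of form (42) is positive definite. Setting
(42) into the left side of (41') yields

$$ \iiint e^{i(s - t)\sigma} \phi(s) \bar{\phi}(t) \, ds \, dt \, dm(\sigma) = \int |\tilde{\phi}(\sigma)|^2 \, dm(\sigma), $$

where $\tilde{\phi}$ is the Fourier transform of $\phi$. Clearly, the right side is nonnegative.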
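
The discrete form of this computation, condition (41), can be tested on a simple example of our choosing: $a(s) = \cos s$, which is of form (42) with masses 1/2 at $\sigma = 1$ and $\sigma = -1$. The points $s_j$ and coefficients $\phi_j$ below are random and purely illustrative.

```python
import cmath
import math
import random

def a(s):
    """a(s) = cos s, the transform of masses 1/2 at sigma = 1 and sigma = -1."""
    return math.cos(s)

random.seed(1)
s = [random.uniform(-5, 5) for _ in range(6)]
phi = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(6)]

# The quadratic form sum_{j,k} a(s_j - s_k) phi_j conj(phi_k).
form = sum(a(sj - sk) * pj * pk.conjugate()
           for sj, pj in zip(s, phi) for sk, pk in zip(s, phi))

# The same quantity written as an integral of a squared modulus against the measure.
def transform(sigma):
    return sum(pj * cmath.exp(1j * sigma * sj) for sj, pj in zip(s, phi))

integral = 0.5 * abs(transform(1.0)) ** 2 + 0.5 * abs(transform(-1.0)) ** 2
```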




To construct the measure m for a given positive-definite function, we proceed as
in the discrete case. We deduce from (41), analogously to (33), that

$$ |a(s)| \le a(0). \tag{43} $$

We recall from section B.5 of Appendix B the Schwartz class of functions S, consisting
of functions f(s) all of whose derivatives $\partial_s^n f(s)$, n = 0, 1, ..., tend to zero
faster than $|s|^{-k}$ as $|s| \to \infty$, for any k. S' is the dual of S; its elements are called
tempered distributions. The function a is, according to (43), bounded; therefore it
belongs to S'. Therefore a has a Fourier inverse b that also belongs to S'. The Parseval
relation

$$ \int b \tilde{f} \, d\sigma = \int a f \, ds \tag{44} $$

holds for all f in S, where $\tilde{f}$ denotes the Fourier transform of f.


According to exercise 4, (41') holds for all $C^\infty$ functions $\phi$ with compact support.
Introduce in (41') s - t = r and s as new variables:

$$ \iint a(r) \phi(s) \bar{\phi}(s - r) \, ds \, dr \ge 0. \tag{45} $$

Denote by f the convolution

$$ \int \phi(s) \bar{\phi}(s - r) \, ds = f(r); \tag{46} $$

f belongs to $C^\infty$ and has compact support, and (45) can be written as

$$ \int a(r) f(r) \, dr \ge 0. \tag{45'} $$

Denote by $\psi$ the Fourier transform of $\phi$; taking the Fourier transform of (46) gives

$$ |\psi(\sigma)|^2 = \tilde{f}(\sigma). \tag{46'} $$

Formula (44) expresses the left side of (45') in terms of b and $\tilde{f}$; using formula (46')
for $\tilde{f}$, we get

$$ \int b(\sigma) |\psi(\sigma)|^2 \, d\sigma \ge 0. \tag{45''} $$

Let $p(\sigma)$ be any nonnegative $C^\infty$ function with compact support on $\mathbb{R}$. Then
$\psi = p^{1/2}$ too is $C^\infty$ with compact support, and so belongs to S. Setting $|\psi|^2 = p$ into
(45''), we get

$$ \int b(\sigma) p(\sigma) \, d\sigma \ge 0 $$




for every nonnegative $C^\infty$ function p with compact support. Such a distribution b
is called nonnegative. According to theorem 13 of Appendix B, b is a nonnegative
measure: $b(\sigma) \, d\sigma = dm$.
We claim that the total mass of m is finite:

$$ \int dm = \int b(\sigma) \, d\sigma < \infty. \tag{47} $$


For let g be any nonnegative $C^\infty$ function of compact support that is equal to 1 on
[-1, 1]. Define $g_n(\sigma) = g(\sigma/n)$. Denote by f the Fourier inverse of g. Then the
Fourier inverse of $g_n$ is $f_n(s) = n f(ns)$. Set $f_n$ into (44):

$$ \int b(\sigma) g(\sigma/n) \, d\sigma = \int a(s) \, n f(ns) \, ds. \tag{48} $$

The measure b and the function g are nonnegative, and $g(\sigma) = 1$ for $|\sigma| \le 1$.
Therefore the left side of (48) is greater than

$$ \int_{-n}^{n} b(\sigma) \, d\sigma. \tag{48'} $$

On the other hand, according to (43), $|a(s)| \le a(0)$. Therefore the right side of (48)
is less than

$$ a(0) \int n |f(ns)| \, ds = a(0) \int |f(s)| \, ds, $$

a quantity independent of n. This shows that the integrals (48') are bounded independently
of n, proving (47).

It follows from (47) that a(s) can be represented pointwise as the Fourier transform
of b:

$$ a(s) = \int e^{is\sigma} \, dm $$

for every s, as claimed in (42). The uniqueness of a representation of form (42)
follows from the uniqueness of the Fourier transform. □

Denote by P the set of positive definite functions a normalized by a(0) = 1. It
follows from theorem 8 that the extreme points of P are the exponentials $e^{i\sigma s}$, $\sigma$ real.
Thus (42) is seen as a Choquet-type representation of positive definite functions.
Theorem 8 is easily extended to n dimensions; see Rudin’s book, Fourier Analysis 
on Groups. 
Laurent Schwartz has given the following extension of Bochner’s theorem. 


Definition. A skew-symmetric complex-valued tempered distribution a(s) on $\mathbb{R}$ is
called positive definite if

$$ \iint a(s - t) \phi(s) \bar{\phi}(t) \, ds \, dt \ge 0 $$

for all $C_0^\infty$ functions $\phi$.


Theorem 8’. Every positive definite tempered distribution is the Fourier transform 
of a nonnegative measure of class S'. 


Schwartz has extended his theorem to R”. 


14.5 A THEOREM OF KREIN 

Definition. Let p be a continuous real-valued even function defined on $\mathbb{R}$:

$$ p(-t) = p(t). $$

p is called evenly positive definite if

$$ \iint p(s - t) \phi(s) \phi(t) \, ds \, dt \ge 0 \tag{49} $$

for all real-valued, continuous, even functions $\phi$ of compact support:

$$ \phi(-s) = \phi(s). $$

Clearly, every even function of form (42) has this property. These functions can be
written as

$$ p(s) = \int_0^\infty \cos \sigma s \, dm(\sigma), \qquad dm \ge 0. $$
0 


These are, however, not all. For all real $\lambda$, and all even, real-valued continuous functions
$\phi$ with compact support

$$ \iint \cosh \lambda (s - t) \, \phi(s) \phi(t) \, ds \, dt = \Big( \int e^{\lambda s} \phi(s) \, ds \Big)^2 \ge 0. $$

This shows that $\cosh \lambda s$ is evenly positive definite. But then so is

$$ p(s) = \int \cosh \lambda s \, dn(\lambda), $$

n any nonnegative measure for which the integral converges for all s.
Similarly an even, real-valued function on $\mathbb{R}$ is called oddly positive definite if (49)
holds for all real-valued, continuous odd functions $\phi$. Examples of such a function
are $-\cosh \lambda s$ and superpositions.
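
The identity behind these two statements is $\cosh \lambda(s - t) = \tfrac{1}{2}(e^{\lambda s} e^{-\lambda t} + e^{-\lambda s} e^{\lambda t})$, so the double integral equals the product of $\int e^{\lambda s} \phi \, ds$ and $\int e^{-\lambda s} \phi \, ds$; the two factors are equal for even $\phi$ and opposite for odd $\phi$. The sketch below checks this with Riemann sums on a symmetric grid; the value of $\lambda$, the grid, and the test functions are our illustrative choices.

```python
import math

lam, n, L = 0.8, 200, 2.0
ds = 2 * L / n
points = [-L + (i + 0.5) * ds for i in range(n)]     # grid symmetric about 0

even = lambda s: (1 - (s / L) ** 2) * math.cos(3 * s)    # an even function on [-L, L]
odd = lambda s: (1 - (s / L) ** 2) * math.sin(3 * s)     # an odd function on [-L, L]

def double_integral(phi):
    """Riemann sum for the double integral of cosh(lam (s - t)) phi(s) phi(t)."""
    return sum(math.cosh(lam * (s - t)) * phi(s) * phi(t)
               for s in points for t in points) * ds * ds

def single_integral(phi):
    """Riemann sum for the integral of e^{lam s} phi(s)."""
    return sum(math.exp(lam * s) * phi(s) for s in points) * ds

even_value, odd_value = double_integral(even), double_integral(odd)
```

For the even function the double sum equals the square of the single sum; for the odd function it equals minus that square.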


M. G. Krein has proved the following result:

Theorem 9. Every real, even, continuous function p on $\mathbb{R}$ that is evenly positive-definite
can be represented uniquely as

$$ p(s) = \int_0^\infty \cos \sigma s \, dm(\sigma) + \int_0^\infty \cosh \lambda s \, dn(\lambda), $$

m and n nonnegative measures. Similarly, every oddly positive-definite function can
be represented as

$$ p(s) = \int_0^\infty \cos \sigma s \, dm(\sigma) - \int_0^\infty \cosh \lambda s \, dn(\lambda). $$

An easy consequence is

Theorem 9'. Denote by P the set of evenly positive-definite functions, normalized
by p(0) = 1. P is a convex set, and its extreme points are

$$ \cos \sigma s, \quad \sigma \ge 0, \qquad \text{and} \qquad \cosh \lambda s, \quad \lambda > 0. $$

For a proof we refer to Krein.


14.6 POSITIVE HARMONIC FUNCTIONS 


In chapter 11, section 11.6, it was shown that every harmonic function h defined in
the unit disk and positive there can be represented uniquely by Poisson's formula

$$ h(re^{i\chi}) = \int \frac{1 - r^2}{1 - 2r \cos(\chi - \theta) + r^2} \, dm(\theta), \tag{50} $$

m a nonnegative measure.
It is easy to verify that the Poisson kernel has for r < 1 the following Fourier
expansion:

$$ \frac{1 - r^2}{1 - 2r \cos \chi + r^2} = \sum_{-\infty}^{\infty} r^{|\ell|} e^{i\ell\chi}. \tag{51} $$

Setting this into (50) gives the Fourier expansion of h:

$$ h(re^{i\chi}) = \sum_{-\infty}^{\infty} b_\ell \, r^{|\ell|} e^{i\ell\chi}, \tag{52} $$

where

$$ b_\ell = \int e^{-i\ell\theta} \, dm(\theta). \tag{52'} $$
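
The Fourier expansion of the Poisson kernel is a pair of geometric series, and it is easy to confirm numerically. The following sketch, with our own function names and sample points, compares the closed form $(1 - r^2)/(1 - 2r\cos\chi + r^2)$ with a partial sum of $\sum r^{|\ell|} e^{i\ell\chi}$, whose imaginary parts cancel in pairs.

```python
import math

def poisson_kernel(r, chi):
    """Closed form (1 - r^2) / (1 - 2 r cos(chi) + r^2)."""
    return (1 - r * r) / (1 - 2 * r * math.cos(chi) + r * r)

def fourier_sum(r, chi, terms=400):
    """Partial sum of r^{|l|} e^{i l chi} over |l| <= terms, written with cosines."""
    return 1 + 2 * sum(r ** l * math.cos(l * chi) for l in range(1, terms + 1))

samples = [(r, chi) for r in (0.1, 0.5, 0.9) for chi in (0.0, 1.0, 2.5, math.pi)]
errors = [abs(poisson_kernel(r, chi) - fourier_sum(r, chi)) for r, chi in samples]
```

At r = 0.9 and angle 0, for instance, both sides equal 19.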




We show now how to deduce Carathéodory's theorem, theorem 7, from (50); this
proof is due to Herglotz. Let $\{a_n\}$ be a positive definite sequence in the sense of (31).
It follows then from (33) that the sequence $a_n$ is bounded; therefore the series

$$ k(re^{i\chi}) = \sum_{-\infty}^{\infty} a_\ell \, r^{|\ell|} e^{i\ell\chi} \tag{53} $$

converges for r < 1, uniformly for $r \le 1 - \delta$. Clearly, k is a harmonic function of x,
y, where $x + iy = re^{i\chi}$ in the unit disk. We claim that k is positive. To see this, we
rewrite the left side of (31) by introducing $n - k = \ell$ as a new variable. We get

$$ \sum_\ell a_\ell \sum_n \phi_n \bar{\phi}_{n-\ell} \ge 0. \tag{53'} $$

We want to choose $\{\phi_n\}$ so that for all $\ell$ and r < 1, $\chi$ given

$$ \sum_n \phi_n \bar{\phi}_{n-\ell} = r^{|\ell|} e^{i\ell\chi}. \tag{54} $$

To satisfy (54) we multiply it by $e^{-i\ell\theta}$ and sum over $\ell$. We get

$$ \sum_\ell \sum_n \phi_n \bar{\phi}_{n-\ell} \, e^{-i\ell\theta} = \sum_\ell r^{|\ell|} e^{i\ell(\chi - \theta)}. \tag{54'} $$

The left side can be written as

$$ \sum_n \phi_n e^{-in\theta} \sum_k \bar{\phi}_k e^{ik\theta} = \Big| \sum_n \phi_n e^{-in\theta} \Big|^2; $$

the right side, by (51), is the Poisson kernel, which is positive. Therefore we can set

$$ \sum_n \phi_n e^{-in\theta} = \Big( \frac{1 - r^2}{1 - 2r \cos(\chi - \theta) + r^2} \Big)^{1/2}. \tag{55} $$

This choice of $\{\phi_n\}$ satisfies (54'), from which (54) follows. Setting (54) into (53')
shows that $k(re^{i\chi})$ is positive for r < 1.

Once k has been shown to be positive, we can appeal to the Herglotz-Riesz theorem,
theorem 6 in chapter 11, and obtain a representation of form (50) for k. As was
shown above, this in turn gives formula (52') for the coefficients $a_\ell$ in the Fourier
expansion of k. This is the desired formula (32). □


Denote by H the linear space of real-valued harmonic functions in the open unit
disk. Denote by P the subset of positive harmonic functions h, normalized by

$$ h(0) = 1. \tag{56} $$

Clearly, P is a convex subset of H. From the uniqueness of the measure in representation
(50), we deduce, as in earlier sections of this chapter, that all extreme points
of P are of the form

$$ e_\theta = \frac{1 - r^2}{1 - 2r \cos(\chi - \theta) + r^2}. \tag{57} $$

This shows that the Herglotz-Riesz representation of positive harmonic functions is
a Choquet-type representation.


Exercise 5. We impose on H the weakest topology that makes continuous all linear
functionals

$$\ell_z(h) = h(z), \qquad |z| < 1. \tag{58}$$


Show that the convex set P defined above is compact in this topology. (Hint: Use 
Harnack’s theorem on positive harmonic functions.) 
14.7 THE HAMBURGER MOMENT PROBLEM 


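A sequence of real numbers $a_0, a_1, \ldots$ is called positive in the sense of Hankel
(H positive) if

$$\sum_{n,k} a_{n+k}\, \xi_n \xi_k \ge 0 \tag{59}$$

for every finite collection of real numbers $\xi_n$, n = 0, 1, ..., N.
Let m be a nonnegative measure on $\mathbb{R}$ all of whose moments are finite:

$$\int_{\mathbb{R}} t^n\, dm(t) < \infty, \qquad n = 0, 1, \ldots. \tag{60}$$

Define

$$a_\ell = \int_{\mathbb{R}} t^\ell\, dm(t), \qquad \ell = 0, 1, \ldots. \tag{61}$$

We claim that this sequence is H positive, for

$$\sum_{n,k} a_{n+k}\, \xi_n \xi_k = \int \sum_{n,k} t^{n+k}\, \xi_n \xi_k\, dm(t) = \int \Big( \sum_{n} \xi_n t^n \Big)^2 dm(t) \ge 0. \tag{62}$$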


Conversely, Hamburger has proved: 
Theorem 10. Every H positive sequence {an} can be represented in the form (61). 


For proof, we refer to chapter 33. An interesting fact is that there are H positive
sequences that can be represented in form (61) in only one way, as well as others that
have several distinct representations. Why this is so will be explained in chapter 33
on self-adjoint operators.

Denote by $H_0$ the set of all H positive sequences normalized by $a_0 = 1$. It follows
from theorem 10 that every extreme point e of $H_0$ is of the form

$$e_k(t) = t^k, \qquad k \text{ in } \mathbb{Z}_+,\ t \text{ real}. \tag{63}$$

It is not hard to show that conversely, every sequence e(t) of form (63) is an extreme
point of $H_0$. Thus (61) is a Choquet-type representation of the set $H_0$.

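Examples of Hankel positive sequences:

Take in (61)

$$\frac{dm(t)}{dt} = \begin{cases} t^{\delta-1} & \text{for } 0 < t < 1,\ \delta > 0, \\ 0 & \text{otherwise.} \end{cases}$$

Then

$$a_\ell = \int_0^1 t^{\ell+\delta-1}\, dt = \frac{1}{\ell+\delta}.$$

Thus

$$\sum_{n,k} \frac{\xi_n \xi_k}{n+k+\delta} \ge 0$$

for all real $\xi_n$.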
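The example can be checked in exact arithmetic. The sketch below is illustrative only: the function names are invented here, and delta is restricted to values that `Fraction` can represent exactly. It evaluates the quadratic form with moments 1/(l + delta) and, for delta = 1, compares it with the integral of the squared polynomial as in (62).

```python
from fractions import Fraction


def moment(l, delta):
    """a_l = integral over (0, 1) of t^(l + delta - 1) dt = 1 / (l + delta)."""
    return Fraction(1) / (l + Fraction(delta))


def hankel_form(xi, delta):
    """Exact value of sum_{n,k} a_{n+k} xi_n xi_k for a finite real vector xi."""
    size = len(xi)
    return sum(moment(n + k, delta) * xi[n] * xi[k]
               for n in range(size) for k in range(size))


if __name__ == "__main__":
    # With delta = 1 and xi = (1, -1) the form is the integral of (1 - t)^2 over (0, 1).
    print(hankel_form([1, -1], 1))
    print(hankel_form([3, -7, 2, 5], Fraction(1, 2)))
```

For delta = 1 and xi = (1, -1) the value is 1 - 1 + 1/3 = 1/3, which is the integral of (1 - t)^2 over the unit interval, in agreement with (62).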
In conclusion, we note that theorem 10 fails to hold in more than one variable. 


14.8 G. BIRKHOFF’S CONJECTURE 


Definition. An n × n matrix $S = (s_{ij})$ is called doubly stochastic if

(i) all entries are nonnegative,

$$s_{ij} \ge 0; \tag{66}$$

(ii) all row sums and column sums are equal to 1,

$$\sum_{j} s_{ij} = 1, \qquad \sum_{i} s_{ij} = 1 \qquad \text{for all } i, \text{ resp. } j. \tag{66'}$$

It is obvious that the set D of all doubly stochastic matrices forms a convex set
in $\mathbb{R}^{n^2}$.
A permutation p of n objects is a one-to-one map of the indices 1, ..., n onto
themselves. The associated permutation matrix P is defined by

$$p_{ij} = \begin{cases} 1 & \text{if } j = p(i), \\ 0 & \text{otherwise.} \end{cases}$$

Clearly, each row and each column of a permutation matrix P contains exactly one
entry equal to 1, and all others are zero. This shows that each P is doubly stochastic,
that is, $P \in D$. We claim that each P is an extreme point of the set D. To see this,
suppose that P were the midpoint of an interval whose endpoints

$$P + Q, \qquad P - Q$$

both belong to D. Clearly, it follows from (66) that if $p_{ij} = 0$, then $q_{ij} = 0$, and
from (66') that if $p_{ij} = 1$, then $q_{ij} = 0$. Since all entries of P are either 0 or 1, it
follows that Q = 0. This proves that P is extreme.
Conversely, D. König and G. Birkhoff have shown that all extreme points of D
are permutation matrices P.


Exercise 6. Prove the König-Birkhoff theorem.
By Carathéodory’s theorem, it follows that all doubly stochastic matrices are 
convex combinations of permutation matrices. This representation, however, is not 


unique in general. 
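For small matrices a convex combination of this kind can be produced explicitly. The sketch below uses the usual greedy peeling procedure (often called the Birkhoff-von Neumann algorithm); it is not taken from the text. The function name is hypothetical, the search over permutations is brute force, and the convention is that `p[i]` is the column holding the 1 in row i, with indices starting at 0.

```python
from fractions import Fraction
from itertools import permutations


def birkhoff_decomposition(S):
    """Write a doubly stochastic matrix as a list of (weight, permutation) pairs."""
    n = len(S)
    R = [[Fraction(x) for x in row] for row in S]  # remainder, updated in place
    terms = []
    while any(x != 0 for row in R for x in row):
        # Brute-force search for a permutation p with R[i][p[i]] > 0 in every row.
        p = next((q for q in permutations(range(n))
                  if all(R[i][q[i]] > 0 for i in range(n))), None)
        if p is None:
            raise ValueError("input is not doubly stochastic")
        w = min(R[i][p[i]] for i in range(n))
        for i in range(n):
            R[i][p[i]] -= w  # at least one more entry of R becomes zero
        terms.append((w, p))
    return terms


if __name__ == "__main__":
    half = Fraction(1, 2)
    S = [[half, half, 0], [half, 0, half], [0, half, half]]
    for weight, perm in birkhoff_decomposition(S):
        print(weight, perm)
```

Each pass zeroes at least one entry of the remainder, so the loop stops after at most n² steps. Which decomposition comes out depends on the order in which permutations are tried, which reflects the non-uniqueness noted above.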


Definition. An n × n matrix $S = (s_{ij})$ is called doubly substochastic if

(i) all entries are nonnegative,

$$s_{ij} \ge 0; \tag{67}$$

(ii) all row sums and column sums are ≤ 1,

$$\sum_{j} s_{ij} \le 1, \qquad \sum_{i} s_{ij} \le 1 \tag{67'}$$

for all i, respectively j.


We denote the set of all doubly substochastic matrices by $D_0$. Clearly, $D_0$ is a
convex set that contains the set D. We call a matrix $P_0$ a subpermutation matrix if
its entries are either 0 or 1, and if each row and column contains at most a single
entry 1. Every $P_0$ belongs to $D_0$. The argument used above to show that every P is
an extreme point of D can be used to prove that every $P_0$ is an extreme point of $D_0$.
Conversely,


Exercise 7. Show that all extreme points of $D_0$ are subpermutation matrices $P_0$.


We turn now to infinite matrices $S = (s_{ij})$, i, j in $\mathbb{Z}_+$. The notions of doubly
stochastic, doubly substochastic, permutation, and subpermutation matrices are defined
exactly as in the n × n case. We denote by X the linear space of all matrices S
with real entries whose rows and columns have uniformly bounded $\ell^1$ norms:

$$\sup_{i} \sum_{j} |s_{ij}| < \infty, \qquad \sup_{j} \sum_{i} |s_{ij}| < \infty. \tag{68}$$

We deal first with doubly substochastic matrices. As topology we use the coarsest
topology in which all the linear functionals that map the matrix S into its ijth
component are continuous:

$$\ell_{ij}(S) = s_{ij}. \tag{69}$$

We recall the following result from chapter 13, exercise 3: the only linear functionals
in X that are continuous are finite linear combinations of the $\ell_{ij}$.


Theorem 11.


(i) The set of extreme points of the convex set $D_0$ consisting of all doubly substochastic
matrices is the set $\{P_0\}$ of subpermutation matrices.


(ii) $D_0$ is the closure of the convex hull of $\{P_0\}$ in the topology induced by the
functionals (69).


Proof. We first prove part (ii). Suppose, on the contrary, that some doubly substochastic
matrix Z does not belong to the closure of the convex hull of $\{P_0\}$. According
to theorem 2' of chapter 13, there would exist a continuous linear functional
$\ell$ such that

$$\ell(Z) > c, \tag{70}$$

but for all T in the closure of the convex hull of $\{P_0\}$,

$$\ell(T) \le c. \tag{71}$$

We may, in particular, set T = any subpermutation matrix $P_0$:

$$\ell(P_0) \le c. \tag{71'}$$

The only linear functionals $\ell$ that are continuous in the coarsest topology that makes
the $\ell_{ij}$ continuous are finite linear combinations of the $\ell_{ij}$:

$$\ell = \sum_{i,j \le n} b_{ij}\, \ell_{ij}. \tag{72}$$

Denote by $S_n$ the projection of any substochastic matrix S onto the n × n matrix
formed by the intersection of the first n rows and columns of S. Clearly, $S_n$ is a
doubly substochastic (n × n) matrix. Then it follows from (69) and (72) that for
any S,

$$\ell(S) = \ell(S_n). \tag{73}$$

Denote the projection of Z by $Z_n$. As remarked earlier, the extreme points of the
set of doubly substochastic n × n matrices are the n × n subpermutation matrices.
It follows from Carathéodory's theorem that on a compact convex set, a continuous
linear functional takes its maximum at one of the extreme points. We have that

$$\ell(Z_n) \le \sup \ell(P_n), \tag{74}$$

where the $P_n$ are n × n subpermutation matrices. Such a $P_n$ is the projection of a
subpermutation matrix $P_0$ of infinite order whose elements not in the first n rows and
columns are set = 0. By (73),

$$\ell(P_n) = \ell(P_0) \quad \text{and} \quad \ell(Z_n) = \ell(Z). \tag{75}$$

Combining (74) with (75), we obtain

$$\ell(Z) \le \sup_{P_0} \ell(P_0).$$

This combined with (71') shows that (70) cannot hold. Therefore every Z in $D_0$
belongs to the closure of the convex hull of $\{P_0\}$.
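The finite-dimensional inequality used in this argument, that a linear functional on the n × n doubly substochastic matrices is bounded by its largest value on subpermutation matrices, can be checked by brute force for small n. The sketch below is illustrative only: the coefficient matrix `b` and the matrix `Z` are made-up data, and the function names are not from the text.

```python
from fractions import Fraction


def subpermutation_matrices(n):
    """Yield every n x n 0/1 matrix with at most one 1 in each row and column."""
    def extend(row, used, choice):
        if row == n:
            yield [[1 if choice[i] == j else 0 for j in range(n)] for i in range(n)]
            return
        # Leave this row empty, or put its single 1 in a column not used so far.
        yield from extend(row + 1, used, choice + [None])
        for j in range(n):
            if j not in used:
                yield from extend(row + 1, used | {j}, choice + [j])
    yield from extend(0, frozenset(), [])


def functional(b, S):
    """l(S) = sum_{i,j} b_ij s_ij, a finite combination of the entry functionals."""
    n = len(b)
    return sum(b[i][j] * S[i][j] for i in range(n) for j in range(n))


# Hypothetical data: coefficients b and a doubly substochastic 3 x 3 matrix Z.
b = [[2, -1, 0], [1, 3, -2], [0, 1, 1]]
Z = [[Fraction(1, 2), Fraction(1, 4), 0],
     [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)],
     [0, Fraction(1, 4), Fraction(1, 2)]]

best = max(functional(b, P) for P in subpermutation_matrices(3))
print(functional(b, Z), "<=", best)
```

Enumerating all subpermutation matrices is only feasible for very small n; the point is just to see the inequality hold on one example, not to suggest a practical method.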


To show that, conversely, all points of the closure of the convex hull of $\{P_0\}$ belong
to $D_0$, we rewrite the criterion (67) and (67') for belonging to $D_0$ as follows:

$$\ell_{ij}(S) \ge 0, \tag{76}$$

$$\sum_{j \le n} \ell_{ij}(S) \le 1, \qquad \sum_{i \le n} \ell_{ij}(S) \le 1 \tag{76'}$$

for all positive integers n. Since by definition of the topology, the functionals (76),
(76') are continuous, and since these inequalities hold on the convex hull of $\{P_0\}$, it
follows that they hold on its closure. This completes the proof of part (ii).

The proof of part (i) is based on theorem 6 of chapter 13, which says that the
extreme points of the closure of the convex hull of a set S belong to S, provided both
sets are compact. In order to apply that theorem to $S = \{P_0\}$, we have to verify that
both $D_0$ and $\{P_0\}$ are compact sets. To see this, we note that the topology we have
imposed is the weak product topology

$$\prod_{i,j} \mathbb{R}.$$

The entries of S in $D_0$ lie in [0, 1], so $D_0$ is a subset of

$$\prod_{i,j} [0, 1].$$

According to Tychonov's theorem, this is a compact set. We have shown already in
part (ii) that $D_0$ is a closed set; being a closed subset of a compact set makes $D_0$
compact.

Similarly, in order to show that $\{P_0\}$ is compact, it suffices to show that this set is
closed. The matrices $P_0$ are characterized by the inequalities (76') and

$$\ell_{ij}(S) \in \{0, 1\}.$$

Each of these sets is closed; therefore so is their intersection $\{P_0\}$.

We now appeal to theorem 6 of chapter 13. It states that given a compact set such
as $\{P_0\}$ whose closed convex hull (which by part (ii) of theorem 11 is $D_0$) is also
compact, all extreme points of the closed convex hull belong to the original
compact set. This completes the proof of part (i) of theorem 11. □


Theorem 12.


(i) Every extreme point of the set D of doubly stochastic infinite matrices is a
permutation matrix P.


(ii) D is the closure of the convex hull of the set {P} of permutation matrices in
the coarsest topology that makes continuous the linear functionals $\ell_{ij}$, $\ell_i$, and
$\ell^j$, where the $\ell_{ij}$ are defined by (69) and

$$\ell_i(S) = \sum_{j} s_{ij}, \qquad \ell^j(S) = \sum_{i} s_{ij}.$$
i. i 
Proof. (i) We start by showing that D is an extreme subset of $D_0$. Suppose that S
in D lies on an interval

$$S = aT + bR, \qquad T \text{ and } R \text{ in } D_0, \qquad a + b = 1, \quad 0 < a,\ 0 < b.$$

Form the row and column sums of both sides:

$$\sum_{j} s_{ij} = a \sum_{j} t_{ij} + b \sum_{j} r_{ij}, \qquad \sum_{i} s_{ij} = a \sum_{i} t_{ij} + b \sum_{i} r_{ij}.$$

Since S belongs to D, the sums on the left are = 1; since T and R belong to $D_0$, the
sums on the right are ≤ 1. Since the two sides are equal, the sums on the right must
all be = 1. But this means that T and R belong to D, and this proves that D is an
extreme subset of $D_0$.
We noted in theorem 7 of chapter 1 that being an extreme subset is a transitive
relation among convex sets. Thus an extreme point E of D is an extreme point of
$D_0$. According to part (i) of theorem 11, all extreme points of $D_0$ are subpermutation
matrices $P_0$, so $E = P_0$. Since E belongs to D, it follows that E is not a sub- but a
genuine permutation matrix P.

This completes the proof of part (i) of theorem 12; part (ii) can be proved along
the lines of the argument presented for part (ii) of theorem 11. Note, however, that
part (ii) cannot be proved by appealing to the Krein-Milman theorem, since the set
D is not closed, and therefore not compact.

Theorem 12 was conjectured by Garrett Birkhoff. The preceding theorems and
proofs are due to Kiefer and Kendall; see D. G. Kendall. □


14.9 DE FINETTI'S THEOREM


The setting of probability theory is a space Ω in which a σ-algebra Σ is specified.
The sets in Σ represent all possible events; a probability measure on Σ represents
the probability of their occurrence. An infinite sequence of occurrences is modeled
by the direct product Z × Ω. The events in Z × Ω form the smallest σ-algebra that
contains all the cylinder sets, formed by the product sets

$$\prod_{j} E_j,$$

where $E_j$ belongs to the σ-algebra Σ in Ω, and all but a finite number of the sets $E_j$
are the whole space Ω.

A probability measure m on the cylinder sets of Z × Ω is called invariant under
permutations if for all cylinder sets

$$m\Big( \prod_{j} E_j \Big) = m\Big( \prod_{j} E_{p(j)} \Big),$$

where $j \to p(j)$ is a permutation of the indices such that p(j) = j for all but a
finite number of j.

The set of probability measures on Z × Ω that are invariant under permutations is
clearly a convex set, where convex combinations of measures are defined in the obvious
way.

A probability measure m on the σ-algebra Σ in Ω induces the product measure
on the cylinder sets of Z × Ω by the formula

$$m\Big( \prod_{j} E_j \Big) = \prod_{j} m(E_j).$$

Since all but a finite number of the $E_j$ are equal to Ω, all but a finite number of the
factors on the right are equal to 1. Clearly, the product measure on Z × Ω induced by
a measure on Ω is invariant under permutation.

De Finetti proved the following important result: 


Theorem 13.
(i) The extreme points of the set of permutation invariant probability measures on
Z × Ω are the product measures.


(ii) Each measure on Z × Ω invariant under permutation can be expressed in a
unique fashion as an integral over the product measures.


Clearly this is a Choquet type of result. For a proof, see de Finetti or any advanced
text on probability.
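A two-point space already shows the phenomenon. In the sketch below, which is illustrative and not from the text, Ω = {0, 1}, a cylinder set is encoded as a tuple whose entries are 0, 1, or `None` (meaning the whole space in that coordinate), and `MIX` is a made-up equal mixture of two coin-flip product measures. The mixture is invariant under permutations of the coordinates, but it is not itself a product measure.

```python
from fractions import Fraction


def product_measure(p, cylinder):
    """Measure of a cylinder set under independent coin flips with P(1) = p."""
    result = Fraction(1)
    for entry in cylinder:
        if entry is None:      # whole space in this coordinate: factor 1
            continue
        result *= p if entry == 1 else 1 - p
    return result


def mixture(weights_and_biases, cylinder):
    """Convex combination of product measures, evaluated on a cylinder set."""
    return sum(w * product_measure(p, cylinder) for w, p in weights_and_biases)


# Hypothetical mixture: biases 1/5 and 4/5, each with weight 1/2.
MIX = [(Fraction(1, 2), Fraction(1, 5)), (Fraction(1, 2), Fraction(4, 5))]

if __name__ == "__main__":
    print(mixture(MIX, (1, 0, None)), mixture(MIX, (None, 1, 0)))  # equal values
    print(mixture(MIX, (1, 1)), mixture(MIX, (1,)) ** 2)           # different values
```

Here the probability that the first two coordinates both equal 1 is 17/50, while the product of the two one-coordinate probabilities is 1/4, so the mixture does not factor.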


14.10 MEASURE-PRESERVING MAPPINGS


In this section Ω denotes a compact metric space, T a homeomorphism of Ω onto Ω.
It can be shown that there exists at least one probability measure on the Borel subsets
of Ω that is invariant under T. There may be many. The collection of all invariant
probability measures forms a convex set. The following result of John Oxtoby sheds
light on the structure of this collection:


Theorem 14.


(i) The extreme points of the convex set of probability measures invariant under
T are those measures with respect to which T is ergodic.


(ii) Every invariant measure can be represented as an integral over the ergodic
measures. This representation is unique.


Proof. We will sketch the proof of part (i). We recall the definition of ergodicity:
a mapping T is ergodic with respect to a measure m on Ω if it is not possible to
decompose Ω into two parts,

$$\Omega = \Omega_1 \cup \Omega_2,$$

both of positive measure,

$$m(\Omega_1) > 0, \qquad m(\Omega_2) > 0,$$

so that both $\Omega_1$ and $\Omega_2$ are invariant under T.

Suppose now that m is invariant under T but that T is not ergodic with respect
to m. Then there is a decomposition of Ω as above. We define two new measures $m_1$
and $m_2$ as the normalized restrictions of m to $\Omega_1$, $\Omega_2$, respectively. That is, for any
Borel set S,

$$m_1(S) = \frac{m(S \cap \Omega_1)}{m(\Omega_1)}, \qquad m_2(S) = \frac{m(S \cap \Omega_2)}{m(\Omega_2)}.$$

Clearly, $m_1$ and $m_2$ are probability measures, and they are invariant under T. The
measure m is a convex combination of them:

$$m = m(\Omega_1)\, m_1 + m(\Omega_2)\, m_2.$$

Since $m_1 \ne m_2$, this shows that if m is not ergodic, it is not an extreme point.
Conversely, we show that if m is not an extreme point, it is not ergodic. Suppose
that

$$m = a m_1 + (1-a) m_2, \qquad 0 < a < 1, \quad m_1 \ne m_2.$$

We first take the case that $m_1$ is absolutely continuous with respect to $m_2$. By the
Radon-Nikodym theorem,

$$m_1 = f m_2, \qquad f \text{ nonnegative and in } L^1(m_2).$$

Since both $m_1$ and $m_2$ are invariant under T, so is f. Since $m_1 \ne m_2$, $f \not\equiv 1$;
therefore there exists a positive number c such that the sets $\Omega_1 = \{\omega \mid f(\omega) > c\}$ and
$\Omega_2 = \{\omega \mid f(\omega) \le c\}$ both have positive $m_2$ measure. Since f is invariant under
T, T maps $\Omega_1$ and $\Omega_2$ onto themselves.
Substituting $m_1 = f m_2$ into the expression for m as a convex combination of $m_1$
and $m_2$ gives

$$m = [af + 1 - a]\, m_2.$$

It follows that $\Omega_1$ and $\Omega_2$ have positive m measure; this shows that T is not ergodic
with respect to m.

The case when $m_1$ is not absolutely continuous with respect to $m_2$ is just as simple.
Then there exist sets E whose $m_2$ measure is zero, but $m_1(E) > 0$. Denote by s the
quantity

$$s = \sup m_1(E), \qquad m_2(E) = 0.$$

Let $E_n$ be a maximizing sequence:

$$\lim m_1(E_n) = s, \qquad m_2(E_n) = 0.$$

Denote by F the union of the $E_n$. Clearly, $m_1(F) = s$, $m_2(F) = 0$. It follows that
F is invariant under T; for if not, the set $F \cup TF$ would have $m_1$ measure greater
than $m_1(F) = s$ but $m_2$ measure 0, contrary to the definition of s.

We claim that in the decomposition $\Omega = F \cup F^c$, both pieces have positive m
measure. For, using the expression of m as a linear combination of $m_1$ and $m_2$, we
get

$$m(F) \ge a\, m_1(F) = as$$

and

$$m(F^c) \ge (1-a)\, m_2(F^c) = (1-a).$$

This shows that T is not ergodic with respect to m.
For the proof of part (ii), see the article by Oxtoby. □
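The first half of the argument can be seen concretely on a finite space. The sketch below is a toy model, not from the text: Ω is the five-point set {0, ..., 4} (a compact metric space in the discrete metric), T is a made-up permutation with the two cycles (0 1 2) and (3 4), and measures are stored as dictionaries of point masses. The uniform measure is invariant but not ergodic, and it splits into the normalized restrictions to the two invariant cycles.

```python
from fractions import Fraction

# Hypothetical homeomorphism of a finite space: cycles (0 1 2) and (3 4).
T = {0: 1, 1: 2, 2: 0, 3: 4, 4: 3}


def measure(m, subset):
    """m(S) for a measure given by its point masses."""
    return sum(m[w] for w in subset)


def is_invariant(m):
    """On a finite space, invariance means every point has the mass of its image."""
    return all(m[T[w]] == m[w] for w in T)


def restrict(m, part):
    """Normalized restriction: S -> m(S intersect part) / m(part)."""
    total = measure(m, part)
    return {w: (m[w] / total if w in part else Fraction(0)) for w in m}


m = {w: Fraction(1, 5) for w in T}       # uniform measure, invariant under T
omega1, omega2 = {0, 1, 2}, {3, 4}       # the two T-invariant pieces
m1, m2 = restrict(m, omega1), restrict(m, omega2)

if __name__ == "__main__":
    print(is_invariant(m), is_invariant(m1), is_invariant(m2))
    print(measure(m, omega1), measure(m, omega2))
```

In this model the measures m1 and m2 are the ergodic ones, each concentrated on a single cycle.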


NOTE. In their work on X-ray crystallography, honored by the Nobel Prize in chemistry
in 1985, H. Hauptman and J. Karle made crucial use of Toeplitz's characterization
(32) of the Fourier coefficients of positive measures.




HISTORICAL NOTE. Dénes König (1884-1944), professor at the Technical Univer- 
sity in Budapest, was the founding father of graph theory. He developed many of 
the basic concepts, and wrote the first book on the subject in 1936. His proof of the 
Birkhoff-König theorem is graph theoretical. The brilliant Hungarian school in graph
theory is his legacy. 

König supervised the Eötvös mathematical competitions for high school students. 
He was extremely kind and encouraging to budding young mathematicians, includ- 
ing the writer of these pages. 

When the German army occupied Hungary in 1944, putting Hungarian Nazis in 
power, König saw what was coming and threw himself out the window of his apart- 
ment. 


BIBLIOGRAPHY 


Akhiezer, N. I. The Classical Moment Problem and Some Related Questions in Analysis (English trans.).
Oliver and Boyd, Edinburgh, 1965.

Bernstein, S. Sur les fonctions absolument monotone. Acta Math., 52 (1929): 1-66.

Birkhoff, G. Three observations on linear algebra. Rev. Univ. Nac. Tucuman (A), 5 (1946): 147-151.

Bochner, S. Vorlesungen über Fouriersche Integrale. Akademische Verlagsgesellschaft, Leipzig, 1932.

Carathéodory, C. Über den Variabilitätsbereich der Koefficienten von Potenzreihen die gegebene Werte
nicht annehmen. Math. Ann., 64 (1907): 95-115.

de Finetti, B. Funzione caratteristica di un fenomeno aleatorio. Atti Accad. Naz. Lincei Rend. Cl. Sci. Fis.
Mat. Nat., (1930): 86-133.

Gelfand, I. M. and Do-Shing, S. On positive definite distributions. Usp. Mat. Nauk., 15 (1960): 185-190.

Hamburger, H. Über eine Erweiterung des Stieltjesschen Moment Problems. Math. Ann., 81 (1920): 235-319;
82 (1921): 120-164, 168-187.

Herglotz, A. Über Potenzreihen mit Positiven Reellen Teil in Einheitskreise. Berichte Verh. Sächs. Akad.
Wiss. Leipzig, Math.-phys. Kl., 63 (1911): 501-511. See also Collected Works, Vandenhoeck and
Ruprecht, Göttingen, 1979.

Kendall, D. G. On infinite doubly stochastic matrices and Birkhoff's problem III. London Math. Soc. J.,
35 (1960): 81-84.

König, D. Theory of Finite and Infinite Graphs. Täubner 1936; Birkhäuser, Boston, 1990, p. 327.

Krein, M. G. On a general method of decomposing Hermite-positive nuclei into elementary products.
Dokl. Akad. Nauk SSSR, 53 (1946): 3-6.

Landau, H. J. Classical background of the moment problem. Proc. Symp. Appl. Math, AMS, 37 (1987).

Oxtoby, J. Ergodic sets. Bull. AMS, 58 (1952): 116-136.

Rudin, W. Fourier Analysis on Groups. Interscience, New York, 1962.

Shohat, J. A. and Tamarkin, J. D. The Problem of Moments. American Mathematical Society, New York,
1943.

Schwartz, L. Théorie des Distributions. 2 vols. Hermann, Paris, 1959.

Toeplitz, O. Über die Fouriersche Entwickelung positive Funktionen. Rend. di Circ. Mat. di Palermo, 32
(1911): 191-192.


15 BOUNDED LINEAR MAPS


In chapter 2 we studied some rudimentary properties of linear maps M of one linear 
space into another. Here we impose topological structures on the linear spaces and 
on the mappings themselves. Alternative names for map, synonymous with it,
are operator and transformation.


15.1 BOUNDEDNESS AND CONTINUITY 


Definition. X and U are a pair of Banach spaces. A linear map (actually any map)

$$M : X \to U$$

is called continuous if it maps convergent sequences into convergent ones, that is, if

$$x_n \to x \quad \text{implies} \quad M x_n \to M x. \tag{1}$$

Here convergence is reckoned in the sense of the norm in X and U, respectively.


Definition. A linear map $M : X \to U$ of one Banach space X into another U is
called bounded if there is a constant c such that for all x in X

$$|Mx| \le c|x|. \tag{2}$$


Theorem 1. A linear map $M : X \to U$ of one Banach space X into another U is
continuous if and only if it is bounded.


Proof. It is easy to show that a bounded linear map is continuous, even Lipschitz
continuous.
Conversely, if M were not bounded, then for every c, say c = n, (2) would fail for
some x, say $x_n$:

$$|M x_n| > n |x_n|.$$

Normalize $x_n$ so that $|x_n| = 1/\sqrt{n}$; then $x_n$ tends to zero but $M x_n$ does not, since
$|M x_n| > \sqrt{n}$. Clearly, (1) is violated, so M is not continuous. □
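The easy direction of theorem 1 can be written out in one line. The following is only a sketch of the standard argument; it uses nothing beyond the linearity of M and the bound (2), with c the constant appearing there.

```latex
% For any x, y in X, linearity and (2) give a Lipschitz estimate:
\[
  |Mx - My| = |M(x - y)| \le c\,|x - y| .
\]
% In particular, if x_n \to x, then
\[
  |Mx_n - Mx| \le c\,|x_n - x| \to 0 ,
\]
% which is the continuity condition (1).
```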




Suppose that the spaces X and U on which and into which M acts are merely 
normed linear spaces, not complete, and suppose that M is bounded in the sense of
(2). Then M can be extended by continuity to a bounded mapping of the completion
of X into the completion of U. This observation is as trivial as it is important,
since most maps of interest are constructed in the fashion described above, being 
first defined in an incomplete space and then extended by a flick of the wrist to the 
completed space. The incomplete space usually consists of smooth functions, the 
complete space of functions less smooth, or not at all smooth. 


Definition. Let $M : X \to U$ be a bounded linear map of one Banach space into
another. Its norm, denoted as |M|, is defined by

$$|M| = \sup_{x \ne 0} \frac{|Mx|}{|x|}. \tag{3}$$

Clearly, for any x in X, (2) holds with c = |M|:

$$|Mx| \le |M||x|. \tag{2'}$$

Equally clearly, |M| is the smallest value of c for which (2) holds for all x in X. A
useful reformulation of (3) is

$$|M| = \sup_{|x|=1} |Mx|. \tag{3'}$$
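The reformulation follows from the homogeneity of the norm and the linearity of M. A short sketch of the computation, where the unit vector y = x/|x| is an auxiliary symbol introduced here for illustration:

```latex
% For x \ne 0 put y = x/|x|, so that |y| = 1. Then
\[
  \frac{|Mx|}{|x|} = \Big| M\Big( \frac{x}{|x|} \Big) \Big| = |My| ,
  \qquad |y| = 1 ,
\]
% so the supremum over all x \ne 0 in (3) equals the supremum over the unit sphere.
```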
Theorem 2. The norm of bounded maps has the following properties:


(i) Homogeneity: for any scalar a, real or complex, |aM| = |a||M|.
(ii) Positivity: |M| ≥ 0, and |M| = 0 if and only if M = 0.
(iii) Subadditivity: |M + K| ≤ |M| + |K|.


Proof. Properties (i) and (ii) are obvious. □


Exercise 1. Prove property (iii).


Definition. The set of all bounded maps of one Banach space X into another U is
denoted by

$$\mathcal{L}(X, U).$$

Theorem 3. $\mathcal{L}(X, U)$ is a Banach space under the norm (3).

Proof. Properties (i), (ii), and (iii) in theorem 2 show that $\mathcal{L}(X, U)$ forms a normed
linear space under the norm (3). What remains to be shown is the completeness of
$\mathcal{L}(X, U)$.

Let $\{M_n\}$ be a Cauchy sequence in $\mathcal{L}(X, U)$:

$$\lim_{n,k \to \infty} |M_n - M_k| = 0. \tag{4}$$

It follows from (4) that for any x in X

$$\lim_{n,k \to \infty} |M_n x - M_k x| = 0. \tag{4'}$$

This shows that $\{M_n x\}$ is a Cauchy sequence in U; since U is complete, the limit
$\lim M_n x = u$ exists. We define the mapping M to be Mx = u; clearly, M is linear.

By definition of the norm,

$$|M_n - M| = \sup_{|x|=1} |M_n x - M x| = \sup_{|x|=1} \lim_{k \to \infty} |M_n x - M_k x| \le \sup_{k > n} |M_n - M_k|.$$

Using (4) it follows that $|M_n - M| \to 0$. □


In the special case where the target space U of the linear mappings is one-dimensional,
that is, isomorphic with $\mathbb{R}$ or $\mathbb{C}$, the bounded linear maps are bounded
linear functionals, and $\mathcal{L}(X, U)$ is just the dual space X' of X.

We recall from chapter 2 the notion of the nullspace $N_M$ of a linear map
$M : X \to U$; it consists of all points x of X mapped into 0 by M:

$$Mx = 0. \tag{5}$$


Theorem 4. Let X and U denote normed linear spaces, $M : X \to U$ a bounded
linear map.


(i) $N_M$, the nullspace of M, is a closed linear subspace of X.
(ii) M, when regarded as a map

$$M_0 : X/N_M \to U,$$

is one-to-one, bounded, with $|M_0| = |M|$. The range of $M_0$ is the same as the
range of M.


Proof. (i) $N_M$ is the inverse image in X of {0} in U. Since {0} is a closed set, and
M is continuous, $N_M$ is closed.

(ii) $x_1$ and $x_2$ belong to the same equivalence class mod $N_M$ if $x_1 - x_2 \in N_M$. By
(5) and linearity, $M x_1 = M x_2$; therefore the mapping $M_0$ is unequivocally defined.
We recall from chapter 5 that the norm in the quotient space X/N is defined as

$$|\{x\}| = \inf_{y \equiv x \bmod N} |y|. \tag{6}$$


We have shown there, in theorem 1, that if N is closed, as $N_M$ is, then the quantity
$|\{x\}|$ is a norm. Using the definition (3) of the norm of a map, and some obvious
manipulations, we have

$$|M| = \sup_{x \ne 0} \frac{|Mx|}{|x|} = \sup_{\{x\} \ne 0}\ \sup_{y \in \{x\}} \frac{|My|}{|y|} = \sup_{\{x\} \ne 0} \frac{|M_0\{x\}|}{|\{x\}|} = |M_0|. \qquad \Box$$

We turn to defining the transpose of a bounded linear map $M : X \to U$, X and
U normed linear spaces. Let $\ell$ be a point of U', the dual of U; that is, $\ell$ is a bounded
linear functional on U. The composite $\ell(Mx)$ is a linear and bounded functional of x:

$$\ell(Mx) = \xi(x). \tag{7}$$

The linear functional $\xi \in X'$ clearly depends linearly on $\ell$:

$$\xi = M'\ell. \tag{7'}$$

$M' : U' \to X'$ is called the transpose of M.

The transpose of a bounded linear map is the infinite-dimensional generalization
of the transpose of a matrix; it is an enormously useful concept. In studying and
using the transpose, it is convenient to denote the action of the linear functionals by
parentheses as follows:

$$\ell(u) = (u, \ell), \qquad \xi(x) = (x, \xi),$$

where $u \in U$, $\ell \in U'$, $x \in X$, $\xi \in X'$. In this notation the relations (7), (7') defining
the transpose can be rewritten as

$$(Mx, \ell) = (x, M'\ell). \tag{8}$$

We recall from chapter 8 (see theorem 7) the definition of the annihilator $R^\perp$ of
a subspace R of a normed linear space U as the subspace $R^\perp$ of U' consisting of all
bounded linear functionals $\ell$ that vanish on R. Similarly, for any subset S of X', we
define $S^\perp$ as the subset of those vectors in X that are annihilated by every vector $\xi$ in
S. Clearly, $S^\perp$ is a closed linear subspace of X. The basic properties of transposition
are summarized in


Theorem 5.

(i) The transpose M' of a bounded linear map M is bounded, and

$$|M'| = |M|. \tag{9}$$

(ii) The nullspace of M' is the annihilator of the range of M:

$$N_{M'} = R_M^\perp. \tag{10}$$

(iii) The nullspace of M is the annihilator of the range of M':

$$N_M = R_{M'}^\perp. \tag{11}$$

(iv) $(M + N)' = M' + N'$.


Proof. By (3'),

$$|M'| = \sup_{|\ell|=1} |M'\ell|. \tag{12}$$

By definition, the norm of $\xi$ in X' is

$$|\xi| = \sup_{|x|=1} |(x, \xi)|. \tag{13}$$

Setting $\xi = M'\ell$ into (13) and combining this with (12), we get, using (8), that

$$|M'| = \sup_{|\ell|=1}\ \sup_{|x|=1} |(x, M'\ell)| = \sup_{|\ell|=1=|x|} |(Mx, \ell)|. \tag{12'}$$

According to theorem 6 of chapter 8, for every u in U,

$$|u| = \max_{|\ell|=1} |(u, \ell)|. \tag{14}$$

On the right side of (12') we maximize first with respect to $\ell$. Using (14), with u =
Mx, we get

$$|M'| = \sup_{|x|=1} |Mx|,$$

which by (3') equals |M|; this proves (9).

To prove (10), we note that for any x in X and any $\ell$ in $N_{M'}$, the right side of (8)
is zero. Therefore so is the left side; this shows that $N_{M'} \subset R_M^\perp$. Conversely, if $\ell$
annihilates the range of M, the left side of (8) is zero for every x. Therefore so is the
right side, which can only be if $M'\ell = 0$. This shows that $N_{M'} \supset R_M^\perp$; these two
relations taken together prove (10).

To prove (11), we note that the left side of (8) is zero when x belongs to the
nullspace of M. This shows that every x in $N_M$ belongs to the nullspace of every
$\xi = M'\ell$, $\ell$ in U'. Conversely, suppose that x belongs to the nullspace of all such $\xi$;
then the right side of (8) is zero for all $\ell$ in U'. But then so is the left side; this can
be only if Mx = u = 0, that is, if x belongs to $N_M$. This proves (11).
Part (iv) is obvious. □


Exercise 2. Let X and U be Banach spaces, U reflexive. Let M be a bounded linear
map $X \to U$. Let $x_n$ be a sequence in X weakly convergent to x. Show that $M x_n$
converges weakly to Mx.


Exercise 3. Denote by I the identity map $X \to X$. Show that I' is the identity map
$X' \to X'$.


In a complex Hilbert space the notion of transpose is replaced by adjoint, defined 
by the analogue of (8) and denoted by an asterisk: 


(Mx, y) = (x, M*y). 
For matrices, the adjoint is the conjugate transpose. 
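The matrix case is easy to check by direct computation. The sketch below is illustrative only: matrices are plain lists of rows, the function names are chosen here, and the scalar product is assumed to be linear in its first argument and conjugate linear in its second.

```python
def matvec(A, x):
    """Apply the matrix A (a list of rows) to the vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def inner(x, y):
    """Scalar product (x, y), linear in x and conjugate linear in y (assumed convention)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def adjoint(A):
    """Conjugate transpose of A."""
    rows, cols = len(A), len(A[0])
    return [[A[i][j].conjugate() for i in range(rows)] for j in range(cols)]


if __name__ == "__main__":
    A = [[1 + 2j, 3j], [2, 1 - 1j]]
    x = [1j, 2]
    y = [1 - 1j, 3 + 0j]
    print(inner(matvec(A, x), y), inner(x, matvec(adjoint(A), y)))
```

Both printed numbers are the same complex value, which is the defining relation (Mx, y) = (x, M*y) in this finite-dimensional setting.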


Exercise 4. Show that theorem 5 is valid for the adjoint operation. 


15.2 STRONG AND WEAK TOPOLOGIES 


The norm of linear maps $X \to U$ defines a metric topology in $\mathcal{L}(X, U)$ that is
sometimes called the uniform topology, in deference to two other topologies that are
also useful and therefore important:


Definition. The strong topology in $\mathcal{L}(X, U)$ is the weakest topology in which all
functions $\mathcal{L} \to U$ of the form $M \to Mx$ are continuous, x being any point of X.


Definition. The weak topology in $\mathcal{L}(X, U)$ is the weakest topology in which all linear
functionals of the form $M \to (Mx, \ell)$ are continuous, x being any point of X
and $\ell$ any point in U'.


Exercise 5. Define the weak* topology in $\mathcal{L}(X, U')$, X, U Banach spaces. Show
that there is a natural one-to-one correspondence between $\mathcal{L}(X, U')$ and $\mathcal{L}(U, X')$,
and that this correspondence is continuous in the weak* topology.


Of equal importance are the corresponding notions of sequential convergence:


Definition. A sequence $\{M_n\}$ of bounded linear maps $X \to U$, X, U Banach
spaces, is called strongly convergent if

$$\text{s-}\lim_{n \to \infty} M_n x \tag{15}$$

exists for every x in X.
$\{M_n\}$ is called weakly convergent if

$$\text{w-}\lim_{n \to \infty} M_n x \tag{15'}$$

exists for all x in X.

It is easy to show, and is left as an exercise, that a strongly or weakly convergent
sequence of maps has a limit M, in the sense that (15), (15') are equal to Mx. We
will denote these relations as s-lim $M_n$ = M and w-lim $M_n$ = M.

Exercise 6. Prove that if w-lim $M_n$ = M, then w-lim $M_n'$ = M' provided that X
is reflexive. (Hint: Use the definition of weak convergence; see (18) below.)


No such result holds for strong convergence; take X and U both to be the Hilbert
space $\ell^2$ (see chapter 6) consisting of vectors

$$x = (a_1, a_2, \ldots), \qquad |x|^2 = \sum_{j} |a_j|^2.$$

Define $M_n$ to be

$$M_n x = (a_n, 0, 0, \ldots).$$

It is easy to see that s-lim $M_n$ = 0. Since $\ell^2$ is a Hilbert space, it is self-dual;
take $\ell = (b_1, b_2, \ldots)$; the relation $(M_n x, \ell) = a_n b_1 = (x, M_n'\ell)$ shows that
$M_n'\ell = (0, \ldots, b_1, \ldots)$, with $b_1$ in the nth place. Clearly, s-lim $M_n'\ell$ does not exist
unless $b_1 = 0$.
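The behavior of this example can be watched on truncated sequences. The sketch below is illustrative and rests on assumptions not in the text: vectors are finite lists standing in for elements of the sequence space, scalars are real, indices n start at 1 as in the text, and all function names are chosen here.

```python
import math


def norm(x):
    """Euclidean norm of a finite real vector."""
    return math.sqrt(sum(a * a for a in x))


def M(n, x):
    """M_n x = (a_n, 0, 0, ...): move the n-th entry to the first place."""
    return [x[n - 1]] + [0.0] * (len(x) - 1)


def M_transpose(n, l):
    """M_n' l = (0, ..., b_1, 0, ...), with b_1 in the n-th place."""
    out = [0.0] * len(l)
    out[n - 1] = l[0]
    return out


def pairing(x, l):
    """Real bilinear pairing (x, l)."""
    return sum(a * b for a, b in zip(x, l))


def dist(x, y):
    return norm([a - b for a, b in zip(x, y)])


if __name__ == "__main__":
    x = [1.0 / j for j in range(1, 51)]   # truncation of the sequence (1/j)
    l = [3.0] + [1.0] * 49                # b_1 = 3
    print([norm(M(n, x)) for n in (1, 10, 50)])                   # decreases like 1/n
    print(dist(M_transpose(2, l), M_transpose(7, l)), 3.0 * math.sqrt(2))
```

The norms of $M_n x$ shrink with n, whereas any two distinct vectors $M_n'\ell$ and $M_k'\ell$ stay at distance $\sqrt{2}\,|b_1|$, so the transposed sequence cannot be Cauchy unless $b_1 = 0$.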

The significance of these notions is that maps of interest are often (one is tempted
to say usually) constructed as limits, uniform, strong, or weak, of sequences of approximate
maps. The following result, as important as it is trivial, is used all the
time:


Theorem 6. Let X, U be Banach spaces, $M_n$ a sequence of linear maps $X \to U$,
uniformly bounded in norm:

$$|M_n| \le c \quad \text{for all } n. \tag{16}$$

Suppose further that

$$\text{s-}\lim M_n x$$

exists for a dense set of x in X. Then $\{M_n\}$ converges strongly, i.e., the s-limit exists
for all x in X.


Exercise 7. 

(a) Prove theorem 6. 

(b) Formulate and prove an analogous theorem for weak convergence. 
15.3 PRINCIPLE OF UNIFORM BOUNDEDNESS 


Uniform boundedness turns out to be not only convenient for proving strong or weak
convergence but necessary as well.


Theorem 7. Let X and U be Banach spaces, $\{M_\nu\}$ a collection of bounded maps
$X \to U$, such that for each x in X and each $\ell$ in U', $(M_\nu x, \ell)$ is bounded by a
constant that depends only on x and $\ell$:

$$|(M_\nu x, \ell)| \le c(x, \ell) \quad \text{for all } M_\nu. \tag{17}$$

Conclusion: $\{M_\nu\}$ is uniformly bounded, meaning that (16) holds.


Proof. We appeal to the principle of uniform boundedness, theorem 4 of chapter 10:
If $\{u_\nu\}$ is a collection of points in a normed linear space U such that for every
linear functional $\ell$ in U', $|\ell(u_\nu)| \le c(\ell)$ for all $u_\nu$, then there is a constant c such that
$|u_\nu| \le c$. We apply this result to $u_\nu = M_\nu x$, and conclude that for all $\nu$,

$$|M_\nu x| \le c(x). \tag{17'}$$

Next we appeal to theorem 2 of chapter 10: If $\{f_\nu\}$ is a collection of real-valued,
continuous, subadditive, and positive homogeneous functions defined on a Banach
space X, and if at each point x of X, $f_\nu(x) \le c(x)$ for all $\nu$, then there is a number
c such that

$$f_\nu(x) \le c|x| \quad \text{for all } x, \text{ all } \nu.$$

We identify the functions $f_\nu$ with $f_\nu(x) = |M_\nu x|$. Clearly the $f_\nu$ are homogeneous,
subadditive, and continuous. According to (17') above, the $f_\nu(x)$ are bounded
at every point. Therefore the $f_\nu$ are uniformly bounded, which in our case means that
$|M_\nu x| \le c|x|$ for all x in X, as asserted in (16). □


The relation w-lim $M_n$ = M means that (15') holds for every x in X, which in
turn means (see definition 1 of chapter 10) that

$$\lim_{n \to \infty} (M_n x, \ell) = (Mx, \ell) \tag{18}$$

for all $\ell$ in U' and all x in X. Since a convergent sequence is bounded, it follows that
condition (17) of theorem 7 is satisfied; therefore by theorem 7, the sequence $M_n$ is
uniformly bounded.


Corollary 7'. A weakly convergent sequence of maps of one Banach space into another
is uniformly bounded.


15.4 COMPOSITION OF BOUNDED MAPS 


We turn now to the discussion of composition, called the product, of a map $M: X \to U$
with another map $N: U \to W$. This operation was studied in chapter 2 from the
point of view of linear algebra. Here we study some further properties of it in case
X, U, and W are Banach spaces, and M and N are bounded linear maps.




Theorem 8. Let X, U, and W denote Banach spaces, M and N bounded linear maps,

    $M: X \to U$, $N: U \to W$.

Then the composite NM is a bounded linear map $X \to W$ with the following properties:

(i) Submultiplicativity: $|NM| \le |N|\,|M|$.
(ii) $(NM)' = M'N'$.


Proof. Applying inequality (4) twice, we get

    $|NMx| \le |N|\,|Mx| \le |N|\,|M|\,|x|$.

Applying definition (3), we get

    $|NM| = \sup_{x \ne 0} \frac{|NMx|}{|x|} \le |N|\,|M|$.    (19)

We turn to (ii): applying (8) twice, we get

    $(NMx, m) = (Mx, N'm) = (x, M'N'm)$.    (20)
□
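
The two properties of theorem 8 can be checked in a finite-dimensional sketch. The setting below is an assumption made for illustration only: X, U, W are taken to be R^2, R^3, R^2 with the max norm, maps are matrices, and the induced operator norm is then the largest absolute row sum. The helper names (`matmul`, `transpose`, `op_norm`) are hypothetical.

```python
# Finite-dimensional sketch: bounded maps are matrices; |x| is the max norm.

def matmul(a, b):
    """Matrix product a*b for matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def op_norm(a):
    """Operator norm induced by the max norm: the largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in a)

M = [[1, -2], [0, 3], [4, 1]]   # M : X -> U with X = R^2, U = R^3
N = [[2, 0, -1], [1, 1, 1]]     # N : U -> W with W = R^2
NM = matmul(N, M)

# Property (i): |NM| <= |N| |M|.
submultiplicative = op_norm(NM) <= op_norm(N) * op_norm(M)
# Property (ii): (NM)' = M'N', with the transpose playing the role of the prime.
transpose_rule = transpose(NM) == matmul(transpose(M), transpose(N))
print(NM, op_norm(NM), op_norm(N) * op_norm(M), submultiplicative, transpose_rule)
```

In this instance the inequality in (i) is strict, which is typical: submultiplicativity is only an upper bound.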


Exercise 8. Prove that multiplication of maps is a continuous operation in the strong
topology on the unit balls of $\mathcal{L}(X, U)$ and $\mathcal{L}(U, W)$.


Definition. Two maps A and M of a linear space X into itself are said to commute if
AM = MA.


Exercise 9. Let X denote a Banach space, A a bounded map $X \to X$ that commutes
with each of a collection $\{M_\nu\}$ of bounded maps $X \to X$. Show that then A
commutes with every map M that lies in the closed linear span of the set of maps
$\{M_\nu\}$ in the weak topology.


Exercise 10. Show that in a complex Hilbert space (NM)* = M*N*. 


15.5 THE OPEN MAPPING PRINCIPLE 


The next group of results, the open mapping principle, and the closed graph theorem, 
goes considerably deeper than the foregoing material. These ideas are due to Stefan 
Banach; their validity is far from being intuitively clear at first glance, or even a 
second one. 


Theorem 9. X and U are Banach spaces, and $M: X \to U$ a bounded linear
mapping of X onto all of U. Then there is a d > 0 such that the image of the open
unit ball in X under M contains the ball of radius d in U:

    $M B_1(0) \supset B_d(0)$.    (21)


Proof. Denote by $B_n$ the open ball of radius n around the origin in either the space
X or U. Since M is assumed to map X onto U, and since the union of all the $B_n$ is all
of X, it follows that $\bigcup_n M B_n = U$. Since the Banach space U is complete, it follows
from the Baire category principle that at least one of the sets $M B_n$ is dense in some
open set. Some translate of this set is dense in some ball around the origin; since the
range of M is all of U, by linearity of M we take that translate to be of the form
$M(B_n - x_0)$. The set $B_n - x_0$ is contained in the ball of radius $n + |x_0|$ around the
origin. So by homogeneity of M, we conclude that $M B_1(0)$ is dense in $B_r(0)$ for
some r > 0. Consequently for any c > 0,

    $M B_c(0)$ is dense in $B_{cr}(0)$.    (22)


We want to show now that any point u in $B_r(0)$ is the image of some point x in
$B_2(0)$:

    $Mx = u$.    (23)

This point x in $B_2(0)$ is constructed as an infinite series

    $x = \sum_{j=1}^{\infty} x_j$.

The terms $x_j$ are constructed recursively: $x_1$ is taken as a point satisfying

    $|u - Mx_1| < \frac{r}{2}$, $|x_1| < 1$;    (24a)

by (22), with c = 1, there is such an $x_1$. We choose $x_2$ as a point satisfying

    $|u - Mx_1 - Mx_2| < \frac{r}{4}$, $|x_2| < \frac{1}{2}$;    (24b)

it follows from (22), with c = 1/2, and (24a) that such an $x_2$ exists. Generally, we
choose $x_m$ to satisfy

    $\left| u - \sum_{j=1}^{m} M x_j \right| < \frac{r}{2^m}$, $|x_m| < \frac{1}{2^{m-1}}$.    (24c)

It follows from (22), with $c = 1/2^{m-1}$, and (24c) for the preceding index m - 1
that there is such an $x_m$.


We noted in chapter 5 on the geometry of normed spaces that if the sum of the
norms $\sum |x_j|$ of a series in a complete normed linear space X converges, the series
$\sum x_j$ converges strongly. Since by (24c), $|x_j| < 1/2^{j-1}$, it follows that $\sum_{j=1}^{\infty} x_j$
converges to a point x in X, and

    $|x| \le \sum_{j=1}^{\infty} |x_j| < \sum_{j=1}^{\infty} \frac{1}{2^{j-1}} = 2$.    (25)

Since M is a bounded map, letting $m \to \infty$ in (24c), we conclude that Mx = u.
Since |x| < 2 by (25), this shows that $M B_2(0) \supset B_r(0)$; by homogeneity, (21)
holds with d = r/2. □


Theorem 9 has a number of consequences; the first one
is the open mapping principle:


Theorem 10. X and U are Banach spaces, $M: X \to U$ a bounded linear map onto
all of U. Then M maps open sets onto open sets.

This is an immediate corollary of theorem 9. □


Theorem 11. X and U are Banach spaces, $M: X \to U$ a bounded linear map that
carries X one-to-one onto U. Then the algebraic inverse of M is a bounded linear
map of $U \to X$.


Proof. It follows from (21) of theorem 9 that for every u in U of norm d/2, there
is an x in the unit ball of X such that Mx = u; note that |x| < 1 = 2|u|/d. Since M
is homogeneous, it follows that for every u in U there is an x in X such that

    $Mx = u$, $|x| \le 2|u|/d$.    (26)

Since M is assumed one-to-one, $x = M^{-1}u$. Clearly, from (26), $|M^{-1}| \le 2/d$. □


Definition. A map $M: X \to U$ from one Banach space into another is called closed
if whenever $\{x_n\}$ is a sequence in X such that

    $x_n \to x$ and $Mx_n \to u$,    (27)

then

    $Mx = u$.    (27')


If M is continuous, it is obviously closed. It is surprising but true that conversely, 
a closed linear map of a Banach space into another is continuous: 


Theorem 12. X and U are Banach spaces, $M: X \to U$ is a closed linear map.
Assertion. M is continuous.


Proof. Define the linear space G to consist of all pairs g of the form

    $g = \{x, Mx\}$, x in X.    (28)

We define the following norm for g in G:

    $|g| = |x| + |Mx|$.    (28')

Clearly, this is a norm. It follows from (27), (27') and the completeness of X and
U that G is complete under this norm. Define the mapping $P: G \to X$ to be the
projection onto the first component, that is,

    $g = \{x, Mx\}$, $Pg = x$.    (29)

By definition (28') of |g|, $|Pg| \le |g|$, meaning that P is a bounded operator, $|P| \le 1$.
Clearly, P is linear and maps G one-to-one onto X. Therefore, by theorem 11, the
inverse of P is bounded; that is, there is a constant c such that $c|Pg| \ge |g|$. In view of
the definition (29) of P and (28') of |g|, it follows that $(c - 1)|x| \ge |Mx|$, meaning
that M is bounded. □


The space G defined by (28) is called the graph of the mapping M. Requiring M to 
be closed is the same as requiring its graph to be closed. Theorem 12 is known as the 
closed graph theorem. The closed graph theorem has many surprising applications. 


Theorem 13. X is a linear space equipped with two norms $|x|_1$ and $|x|_2$ that are
compatible in the following sense: If a sequence $\{x_n\}$ converges in both norms, the
two limits are equal.

Suppose that X is complete with respect to both norms; then the two norms are
equivalent. That is to say, there is a constant c such that for all x in X, $|x|_1 \le c|x|_2$,
$|x|_2 \le c|x|_1$.


Proof. Denote by $X_1$, resp. $X_2$, the space X under the 1-, resp. 2-norm. By hypothesis,
both $X_1$ and $X_2$ are complete. Compatibility clearly means that the identity map
between $X_1$ and $X_2$ is closed. Therefore, by the closed graph theorem, it is bounded
in both directions. □


Theorem 14. X and U are Banach spaces, $M: X \to U$ a bounded linear map.
Assume that the range $R_M$ is a finite-codimensional subspace of U; then $R_M$ is
closed.


Exercise 11. Prove theorem 14. (Hint: Extend M to $X \oplus Z$ so that its range is all
of U.)


Exercise 12. Show that for every infinite-dimensional Banach space there are linear 
subspaces of finite codimension that are not closed. (Hint: Use Zorn’s lemma.) 


Theorem 15. X is a Banach space, Y and Z closed subspaces of X that complement
each other: $X = Y \oplus Z$, in the sense that every x in X can be decomposed uniquely
as x = y + z, y in Y, z in Z. Denote the two components y and z of x by

    $y = P_Y x$, $z = P_Z x$.

(i) $P_Y$ and $P_Z$ are linear maps of X onto Y and Z, respectively.
(ii) $P_Y^2 = P_Y$, $P_Z^2 = P_Z$, $P_Y P_Z = 0$.
(iii) $P_Y$ and $P_Z$ are continuous.


Proof. Parts (i) and (ii) are obvious. To prove part (iii) we observe that since Y
and Z are closed, and the decomposition is unique, it follows that the graphs of $P_Y$
and $P_Z$ are closed. The closed graph theorem does the rest. □


A map satisfying $P^2 = P$ is called a projection.
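
The algebraic identities in part (ii) of theorem 15 can be seen in a small sketch. The choice of spaces is an assumption for illustration: X = R^2, Y spanned by (1, 0), and Z spanned by (1, 1), so that (a, b) = (a - b, 0) + (b, b). The matrices `P_Y`, `P_Z` and the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(p, x):
    """Apply the matrix p to the vector x."""
    return [sum(p[i][j] * x[j] for j in range(len(x))) for i in range(len(p))]

# x = (a, b) decomposes uniquely as (a - b, 0) + (b, b).
P_Y = [[1, -1], [0, 0]]   # component in Y = span{(1, 0)}
P_Z = [[0, 1], [0, 1]]    # component in Z = span{(1, 1)}

x = [5, 2]
y, z = apply(P_Y, x), apply(P_Z, x)
print(y, z)                                   # the two components of x
print(matmul(P_Y, P_Y) == P_Y, matmul(P_Z, P_Z) == P_Z)
print(matmul(P_Y, P_Z))                       # the zero matrix
```

Note that these projections are not orthogonal; theorem 15 only needs the two subspaces to be closed and complementary.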

We conclude this chapter by observing that complete metric spaces have proper 
subsets, called sets of second category, that are not unions of a denumerable number 
of nowhere dense sets. This allows a sharpening of the open mapping principle: 


Theorem 16. X and U are Banach spaces, $M: X \to U$ a bounded linear map
whose range $R_M$ is a subset of U of second category. Then the range of M is all
of U.


Exercise 13. Prove theorem 16. 


HISTORICAL NOTE. Stefan Banach (1892-1945), a Polish mathematician, was one 
of the founding fathers of functional analysis. Banach spaces are named in recogni- 
tion of his numerous and deep contributions, and for having written the first mono- 
graph on the subject (1932). He was the inspiration of the brilliant Polish school of 
functional analysis. 

During the Second World War, Banach was one of a group of people whose bodies 
were used by the Nazi occupiers of Poland to breed lice, in an attempt to extract an 
anti-typhoid serum. He died shortly after the conclusion of the war. 

The Nazi attitude toward Poles is epitomized by the following story. After the 
conquest of France in 1940, when Hitler ruled most of Europe, a leading German 
mathematician, a member of the Nazi party, called on Elie Cartan, the dean of French 
mathematicians, to discuss the organization of mathematical life in the new European 
order. Cartan wanted to know how the Polish mathematicians would fit in. “Oh,” the 
German replied, “the Führer has declared the Poles to be subhuman.” 


BIBLIOGRAPHY 


Banach, S. Sur les fonctionnelles linéaires, I, II. Studia Math., 1 (1929): 211-216, 223-339.


Schauder, J. Über die Umkehrung linearer, stetiger Funktionaloperationen. Studia Math., 2 (1930): 1-6.


EXAMPLES OF BOUNDED 
LINEAR MAPS 


An important class of linear maps is furnished by integral operators. The first part 
of this chapter is devoted to investigating their boundedness in various norms. Let T 
and S be Hausdorff spaces, equipped with measures n and m. K denotes an integral
operator mapping complex-valued functions f on T into complex-valued functions
g on S:

    $g(s) = (Kf)(s) = \int_T K(s, t) f(t)\, dn(t)$.    (1)

The complex-valued function K(s, t) is called the kernel of K; f, K are assumed
to be measurable and restricted so that (1) defines a measurable function g. Each
subsequent theorem reveals a natural class for f, K, and g. We recall from chapter 4
the $L^p$-norms:

    $|f|_{L^p} = \left( \int_T |f(t)|^p\, dn(t) \right)^{1/p}$, $1 \le p < \infty$.

The space $L^p(T, n)$ is the completion of the space $C_0(T)$ in the $L^p$-norm. The space
$L^p(S, m)$ is defined analogously. The space $L^\infty$ is the space of essentially bounded,
measurable functions.


16.1 BOUNDEDNESS OF INTEGRAL OPERATORS


We start with conditions that guarantee that (1) is a bounded map from $L^1(T, n)$ or
$L^\infty(T, n)$ to $L^1(S, m)$ or $L^\infty(S, m)$.


Theorem 1.

(i) The map K defined by (1) is bounded as a mapping $L^1 \to L^\infty$, and

    $|K| \le \sup_{s,t} |K(s, t)|$,    (2i)

provided that the quantity on the right is $< \infty$.

(ii) K is bounded as a mapping $L^\infty \to L^1$, and

    $|K| \le \int\!\!\int |K(s, t)|\, dm(s)\, dn(t)$,    (2ii)

provided that the quantity on the right is $< \infty$.

(iii) K is bounded as a mapping $L^\infty \to L^\infty$, and

    $|K| \le \sup_s \int |K(s, t)|\, dn(t)$,    (2iii)

if the quantity on the right is $< \infty$.

(iv) K is bounded as a mapping $L^1 \to L^1$, and

    $|K| \le \sup_t \int |K(s, t)|\, dm(s)$,    (2iv)

if the quantity on the right is $< \infty$.
Proof. By (1), for any s in S,

    $|g(s)| \le \int |K(s, t)|\,|f(t)|\, dn(t)$.    (3)

The right side is $\le \sup_t |K(s, t)|\,|f|_{L^1}$, so

    $|g|_{L^\infty} = \sup_s |g(s)| \le \sup_{s,t} |K(s, t)|\,|f|_{L^1}$.

This proves (2i).
Integrate (3) with respect to dm over S:

    $|g|_{L^1} = \int |g(s)|\, dm(s) \le \int\!\!\int |K(s, t)|\,|f(t)|\, dn(t)\, dm(s)$
    $\qquad = \int \left[ \int |K(s, t)|\, dm(s) \right] |f(t)|\, dn(t)$.    (4)

The right side is

    $\le \int\!\!\int |K(s, t)|\, dm(s)\, dn(t)\; |f|_{L^\infty}$;

this proves (2ii).
The right side of (4) is also less than

    $\sup_t \int |K(s, t)|\, dm(s)\; |f|_{L^1}$;

this proves (2iv).
The right side of (3) is less than

    $\int |K(s, t)|\, dn(t)\; |f|_{L^\infty}$;

this combined with (3) proves (2iii). □


Note that when K(s, t) and f(t) are both positive, the sign of equality holds in
both (3) and (4). From this it is not hard to deduce

Corollary 1'. When the kernel K(s, t) in (1) is nonnegative, the sign of equality
holds in (2iii) and (2iv).
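
A discrete sketch may help fix the four bounds of theorem 1 and the equality case for nonnegative kernels. The setting is an assumption made for illustration: S and T are finite sets carrying counting measure, so the kernel is a matrix `K[s][t]`, integrals are sums, and suprema are maxima. All names in the block are hypothetical.

```python
K = [[1, 2, 0],
     [3, 0, 1]]            # nonnegative kernel K[s][t], s in {0, 1}, t in {0, 1, 2}

def apply(K, f):
    """g(s) = sum over t of K(s, t) f(t), the discrete form of (1)."""
    return [sum(K[s][t] * f[t] for t in range(len(f))) for s in range(len(K))]

def norm1(v):
    return sum(abs(x) for x in v)

def norm_inf(v):
    return max(abs(x) for x in v)

sup_entry = max(abs(k) for row in K for k in row)                  # right side of (2i)
total = sum(abs(k) for row in K for k in row)                      # right side of (2ii)
row_sup = max(sum(abs(k) for k in row) for row in K)               # right side of (2iii)
col_sup = max(sum(abs(row[t]) for row in K) for t in range(3))     # right side of (2iv)

f = [1, -2, 3]
g = apply(K, f)
bounds_hold = (norm_inf(g) <= sup_entry * norm1(f) and
               norm1(g) <= total * norm_inf(f) and
               norm_inf(g) <= row_sup * norm_inf(f) and
               norm1(g) <= col_sup * norm1(f))

# Equality for a nonnegative kernel: the constant function 1 attains (2iii),
# and the unit mass at the column with the largest sum attains (2iv).
attains_iii = norm_inf(apply(K, [1, 1, 1])) == row_sup
attains_iv = norm1(apply(K, [1, 0, 0])) == col_sup
print(g, bounds_hold, attains_iii, attains_iv)
```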


The transpose of K is easily written down; denote by $(\ ,\ )_S$ and $(\ ,\ )_T$ the standard
$L^2$ scalar product on S and T with respect to dm and dn, respectively:

    $(g, h)_S = \int g(s) h(s)\, dm(s)$,    (5)
    $(k, f)_T = \int k(t) f(t)\, dn(t)$.    (5')

Multiplying (1) by h(s) and integrating gives

    $(Kf, h)_S = \int_S\!\int_T K(s, t) f(t) h(s)\, dn(t)\, dm(s) = (f, K'h)_T$,    (6)

where

    $(K'h)(t) = \int_S K(s, t) h(s)\, dm(s)$.    (6')

In words, the kernel of the transpose K' is the same as the kernel of K, with the roles
of the variables s and t interchanged.

We recall now theorem 5 of chapter 15, according to which the norm of K' equals
the norm of K. We verify this in the case when the kernel K is nonnegative, and K
is regarded as mapping $L^1(T)$ into $L^1(S)$. According to corollary 1', |K| is given
by formula (2iv). K', on the other hand, maps $L^\infty(S)$ into $L^\infty(T)$, and its norm is
given by formula (2iii), with the roles of s and t reversed. Clearly, |K'| = |K|, as it
should be.


We turn now to the $L^2$-norms, which we denote as $\|\ \|$; the corresponding norm
of K is denoted as $\|K\|$.


Theorem 2. The map K defined by (1) is bounded as a map $L^2 \to L^2$, and

    $\|K\|^2 \le \int_S\!\int_T |K(s, t)|^2\, dm\, dn$,    (7)

provided that the quantity on the right is $< \infty$.


Proof. Applying the Schwarz inequality (see chapter 6) to the integral on the right
in (1), we get

    $|g(s)|^2 \le \int_T |K(s, t)|^2\, dn \int_T |f(t)|^2\, dn$.

Integrating both sides dm gives

    $\|g\|^2 \le \int_S\!\int_T |K(s, t)|^2\, dn\, dm\; \|f\|^2$,

as asserted in (7). □


Inequality (7) is due to Hilbert and E. Schmidt. Another criterion has been given 
by Holmgren: 


Theorem 3. K as defined by (1) is bounded $L^2 \to L^2$, and

    $\|K\| \le \left( \sup_s \int |K(s, t)|\, dn \right)^{1/2} \left( \sup_t \int |K(s, t)|\, dm \right)^{1/2}$,    (8)

provided that the quantity on the right is $< \infty$.

Proof. According to theorem 1 in chapter 6,

    $\|g\| = \max_{\|h\| = 1} (g, h)_S$.    (9)

We are going to use (9) to estimate g = Kf. By (6),

    $(g, h)_S = \int\!\!\int K(s, t) f(t) h(s)\, dn\, dm$.    (10)

For any three positive numbers f, h, and c, $fh \le cf^2/2 + h^2/2c$, so the right side of
(10) is

    $\le \int\!\!\int |K(s, t)| \left[ \frac{c}{2} |f(t)|^2 + \frac{1}{2c} |h(s)|^2 \right] dm\, dn$.

In the first term we integrate first with respect to s, in the second with respect to t.
We get the estimate

    $\frac{c}{2} \sup_t \int |K(s, t)|\, dm\; \|f\|^2 + \frac{1}{2c} \sup_s \int |K(s, t)|\, dn\; \|h\|^2$.    (10')

We take now $\|f\| = 1 = \|h\|$ and choose c so that (10') is as small as possible. This
minimum is

    $\left( \sup_t \int |K(s, t)|\, dm \right)^{1/2} \left( \sup_s \int |K(s, t)|\, dn \right)^{1/2}$.    (10'')

Combining (9) and (10) with (10''), we conclude that for $\|f\| = 1$, $\|Kf\|$ is $\le$ the
quantity in (10''). In view of the definition $\|K\| = \sup \|Kf\|$, $\|f\| = 1$, this proves
(8). □
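
The two criteria (7) and (8) can be compared on a discrete kernel. This is only a sketch under stated assumptions: S and T are the integers 0..19 with counting measure, the kernel K(s, t) = 2^(-|s - t|) is chosen for illustration, and the norm of K is estimated from below by power iteration (a swapped-in numerical technique, not part of the text). All function names are hypothetical.

```python
import math

def matvec(K, f):
    return [sum(k * x for k, x in zip(row, f)) for row in K]

def rmatvec(K, h):
    """Apply the transpose K' (roles of s and t interchanged)."""
    return [sum(K[s][t] * h[s] for s in range(len(K))) for t in range(len(K[0]))]

def l2(v):
    return math.sqrt(sum(x * x for x in v))

def norm_estimate(K, steps=200):
    """Power iteration on K'K; the value ||Kf|| with ||f|| = 1 never exceeds ||K||."""
    f = [1.0] * len(K[0])
    for _ in range(steps):
        f = rmatvec(K, matvec(K, f))
        size = l2(f)
        f = [x / size for x in f]
    return l2(matvec(K, f))

n = 20
K = [[2.0 ** -abs(s - t) for t in range(n)] for s in range(n)]

hilbert_schmidt = math.sqrt(sum(k * k for row in K for k in row))   # bound from (7)
row_sup = max(sum(abs(k) for k in row) for row in K)
col_sup = max(sum(abs(K[s][t]) for s in range(n)) for t in range(n))
holmgren = math.sqrt(row_sup * col_sup)                              # bound from (8)
estimate = norm_estimate(K)
print(estimate, holmgren, hilbert_schmidt)
```

For this kernel the Holmgren bound stays below 3 however large n is, while the Hilbert-Schmidt bound grows like the square root of n, so neither criterion dominates the other in general.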


16.2 THE CONVEXITY THEOREM OF MARCEL RIESZ 


The two factors appearing on the right in (8) are the square roots of the quantities
appearing on the right in (2iii) and (2iv). We noted in corollary 1' that for a positive
kernel these quantities are not merely upper bounds for the norm of $K: L^\infty \to L^\infty$
and $L^1 \to L^1$ but equal to these norms.


Definition. Denote by M(p, q) the norm of

    $K: L^p(T, n) \to L^q(S, m)$.    (11)

For integral operators whose kernel is $\ge 0$, we can restate inequality (8) as follows:

    $M(2, 2) \le M^{1/2}(1, 1)\, M^{1/2}(\infty, \infty)$.

This turns out to be a special case of a far more general theorem due to M. Riesz.


Theorem 4. Let M be a linear map of complex-valued functions defined on T into
complex-valued functions defined on S. Suppose that M carries functions measurable
with respect to n into functions measurable with respect to m, and
is bounded with respect to two pairs of norms:

    $L^{p_0}(T, n) \to L^{q_0}(S, m)$ and $L^{p_1}(T, n) \to L^{q_1}(S, m)$.

Conclusion: then M is a bounded map of $L^{p(a)}(T, n) \to L^{q(a)}(S, m)$, where

    $\frac{1}{p(a)} = \frac{1-a}{p_0} + \frac{a}{p_1}$, $\frac{1}{q(a)} = \frac{1-a}{q_0} + \frac{a}{q_1}$, $0 \le a \le 1$.    (12)

Furthermore, M(p, q) is a log-convex function of its arguments:

    $M(p(a), q(a)) \le M^{1-a}(p_0, q_0)\, M^{a}(p_1, q_1)$;    (12')

here M(p, q) is the norm of the operator defined in (11).


Proof. We sketch Thorin's beautiful proof of this theorem. The starting point is
the following result due to Hadamard:


Three Lines Theorem. Let $\phi(\zeta)$ be a bounded analytic function in the strip
$0 \le \mathrm{Re}\, \zeta \le 1$. Denote

    $N(a) = \sup_\eta |\phi(a + i\eta)|$.    (13)

Then

    $N(a) \le N^{1-a}(0)\, N^{a}(1)$.    (13')

Proof. Set c = log N(0)/N(1); by (13), the function $\phi(\zeta) e^{c\zeta}$ is in absolute value
$\le N(0)$ for $\mathrm{Re}\, \zeta = 0$ and $\mathrm{Re}\, \zeta = 1$. So, by the maximum principle applied in the
strip $0 \le \mathrm{Re}\, \zeta \le 1$,

    $|\phi(a + i\eta)|\, e^{ca} \le N(0)$;

from this and the definition of c, (13') follows. □


We turn now to the mapping M; by definition of the norm,

    $M(p, q) = \sup_{|f|_{L^p} = 1} |Mf|_{L^q}$.

Furthermore, according to theorem 5 of chapter 5 (Hölder's inequality and equality),
for any g in $L^q$, $|g|_{L^q} = \sup_{|h|_{L^{q'}} = 1} |(g, h)_S|$, where q' is dual to q,

    $\frac{1}{q} + \frac{1}{q'} = 1$.

Combining the last two, with g = Mf, we get

    $M(p, q) = \sup_{|f|_{L^p} = 1,\ |h|_{L^{q'}} = 1} |(Mf, h)|$.    (14)

We take p = p(a), q = q(a) as defined in (12). The complex-valued functions f and
h can be factored as $f = |f| e^{i\theta}$, $h = |h| e^{i\psi}$. For any $\zeta$ in the strip $0 \le \mathrm{Re}\, \zeta \le 1$
we define

    $f(\zeta) = |f|^{p(a)/p(\zeta)} e^{i\theta}$, $h(\zeta) = |h|^{q'(a)/q'(\zeta)} e^{i\psi}$,    (15)

where p(a), $p(\zeta)$, and so on, are defined by formula (12). Note that f(a) = f,
h(a) = h. Since $1/p(\zeta)$ and $1/q(\zeta)$ are linear functions of $\zeta$, so is $1/q'(\zeta)$. Therefore
$f(\zeta)$ and $h(\zeta)$ are analytic functions of $\zeta$, and so is

    $\phi(\zeta) = (Mf(\zeta), h(\zeta))_S = \int Mf(\zeta)\, h(\zeta)\, dm(s)$.    (15')


Lemma 5. Let f and h be functions of unit norm, $|f|_{L^{p(a)}} = 1$, $|h|_{L^{q'(a)}} = 1$, and
$\phi(\zeta)$ defined as above. Define N(a) as the supremum of $|\phi(\zeta)|$ on the line $\mathrm{Re}\, \zeta = a$;
we claim that

    $N(0) \le M(p_0, q_0)$, $N(1) \le M(p_1, q_1)$.    (16)


Proof. Let's take $\mathrm{Re}\, \zeta = 0$. Then $\zeta = i\eta$, and so by formula (12),

    $\mathrm{Re}\, \frac{p(a)}{p(\zeta)} = \frac{p(a)}{p_0}$, $\mathrm{Re}\, \frac{q'(a)}{q'(\zeta)} = \frac{q'(a)}{q'_0}$.    (17)

From (15),

    $|f(i\eta)|_{L^{p_0}}^{p_0} = |f|_{L^{p(a)}}^{p(a)}$, $|h(i\eta)|_{L^{q'_0}}^{q'_0} = |h|_{L^{q'(a)}}^{q'(a)}$.    (18)

Since f and h were chosen so that $|f|_{L^{p(a)}} = 1$, $|h|_{L^{q'(a)}} = 1$, it follows from (18)
that $|f(i\eta)|_{L^{p_0}} = 1$, $|h(i\eta)|_{L^{q'_0}} = 1$. Therefore,

    $|Mf(i\eta)|_{L^{q_0}} \le M(p_0, q_0)$.    (19)

Estimate $\phi$ defined in (15') by Hölder's inequality; using (19), and $|h(i\eta)|_{L^{q'_0}} = 1$,
we get

    $|\phi(i\eta)| = |(Mf(i\eta), h(i\eta))| \le |Mf(i\eta)|_{L^{q_0}}\, |h(i\eta)|_{L^{q'_0}} \le M(p_0, q_0)$.

This proves the first part of (16); the second part follows in exactly the same fashion.
□


We apply now the three lines theorem, (13'), to $\phi$ defined by (15'); using (16), we
get

    $|\phi(a)| \le N(a) \le M^{1-a}(p_0, q_0)\, M^{a}(p_1, q_1)$.    (20)

Since f(a) = f, h(a) = h, by (15')

    $\phi(a) = (Mf, h)_S$.

According to (14) the supremum of the right side over all f, h of unit norm is the
norm of M:

    $M(p(a), q(a)) = \sup |\phi(a)|$.

Using the estimate (20) for the right side, we obtain the desired inequality (12'). □


16.3 EXAMPLES OF BOUNDED INTEGRAL OPERATORS 


Theorems 2 and 3 both furnish criteria for an integral operator to be bounded as a map
$L^2 \to L^2$. These criteria are very far from being necessary for boundedness, and are
insufficient for proving the $L^2 \to L^2$ boundedness of the most important and most
beloved mappings. We illustrate this on a number of examples.


16.3.1 The Fourier Transform 


The Fourier transform is defined by

    $(Ff)(s) = \int f(t)\, e^{-ist}\, \frac{dt}{\sqrt{2\pi}}$,    (21)

whose kernel is

    $K(s, t) = \frac{e^{-ist}}{\sqrt{2\pi}}$,    (21')

T = S = R, m and n Lebesgue measure.

Clearly, for this K the right side of both (7) and (8) is $\infty$, so neither theorem 2 nor
theorem 3 can be used to show the $L^2 \to L^2$ boundedness of the Fourier transform.
Yet it is well known that it is bounded; see theorem 21 of Appendix B. On the other
hand, we can use part (i) of theorem 1 to conclude that $F: L^1 \to L^\infty$ is bounded
by $1/\sqrt{2\pi}$. We can now appeal to the M. Riesz convexity theorem, theorem 4, with
$(p_0, q_0) = (2, 2)$, $(p_1, q_1) = (1, \infty)$ to conclude, after a brief calculation,


Theorem 6. For $1 \le p \le 2$, F is a bounded map of $L^p \to L^{p/(p-1)}$ and

    $|F| \le \left( \frac{1}{\sqrt{2\pi}} \right)^{(2-p)/p}$.    (22)

This inequality is called the Hausdorff-Young inequality after its discoverers.
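
The "brief calculation" behind the Hausdorff-Young exponents can be sketched as follows. The block interpolates between the endpoint pairs (2, 2) and (1, infinity) using formula (12), taking the $L^2 \to L^2$ norm of F to be 1 and the $L^1 \to L^\infty$ norm to be at most $1/\sqrt{2\pi}$, as stated in the text. The function names `interpolate` and `hausdorff_young` are hypothetical.

```python
from fractions import Fraction
import math

def interpolate(p0, q0, p1, q1, a):
    """Return (1/p(a), 1/q(a)) as in (12); the reciprocal of infinity is taken as 0."""
    def inv(r):
        return Fraction(0) if r == math.inf else 1 / Fraction(r)
    return ((1 - a) * inv(p0) + a * inv(p1),
            (1 - a) * inv(q0) + a * inv(q1))

def hausdorff_young(p):
    """For 1 <= p <= 2 return (q, bound): F maps L^p to L^q with norm <= bound."""
    p = Fraction(p)
    a = (2 - p) / p                           # solves 1/p = (1 - a)/2 + a/1
    inv_p, inv_q = interpolate(2, 2, 1, math.inf, a)
    q = math.inf if inv_q == 0 else 1 / inv_q
    # Log-convexity (12') with M(2, 2) = 1 and M(1, inf) <= 1/sqrt(2 pi).
    bound = (1 / math.sqrt(2 * math.pi)) ** float(a)
    return q, bound

for p in (1, Fraction(4, 3), Fraction(3, 2), 2):
    print(p, hausdorff_young(p))
```

For p = 4/3 the interpolation parameter is a = 1/2, which gives q = 4 = p/(p - 1), in agreement with theorem 6.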


16.3.2 The Hilbert Transform 


Let h(t) be a real-valued function on R, fairly smooth ($C^1$ will do) and tending to
zero as $|t| \to \infty$ at a reasonable rate, say $O(t^{-2})$.
The Cauchy integral

    $f(\zeta) = \frac{1}{\pi i} \int_{\mathbb{R}} \frac{h(t)}{t - \zeta}\, dt$    (23)

defines a function $f(\zeta)$, or rather two functions, one analytic in the upper half-plane,
the other in the lower half-plane. We will restrict $\zeta$ to the upper half-plane.
Writing $\zeta = \xi + i\eta$, we can express the real and imaginary parts of f as follows:

    $f(\zeta) = \frac{1}{\pi} \int h(t)\, \frac{\eta}{(t - \xi)^2 + \eta^2}\, dt + \frac{i}{\pi} \int h(t)\, \frac{\xi - t}{(t - \xi)^2 + \eta^2}\, dt$.    (23')

Using the properties imposed on h, it is not hard to show that

(i) As $|\zeta| \to \infty$,

    $|f(\zeta)| = O(|\zeta|^{-1})$.    (24)

(ii) $f(\zeta)$ is continuous up to the real axis, and its real part there equals h:

    $f(\xi) = h(\xi) + i k(\xi)$,    (25)

where k is expressed in terms of h as the principal value integral

    $k(\xi) = \frac{1}{\pi}\, \mathrm{PV}\!\int \frac{h(t)}{\xi - t}\, dt = (Hh)(\xi)$.    (25')

The map H defined in (25') is called the Hilbert transform; it relates the real to
the imaginary part of the boundary value of analytic functions in the upper half-plane
satisfying (24).

Theorem 7. The Hilbert transform is an isometry of $L^2(\mathbb{R}) \to L^2(\mathbb{R})$.


Proof. Since $f^2$ is analytic in $\mathrm{Im}\, \zeta > 0$, by Cauchy's theorem,

    $\oint f^2(\zeta)\, d\zeta = 0$    (26)

over every closed contour there. We take now the contour to consist of a line segment
$\zeta = \xi + i\epsilon$, $-R \le \xi \le R$, and a semicircle $\xi = R\cos\theta$, $\eta = R\sin\theta + \epsilon$. Now let
$\epsilon \to 0$ and $R \to \infty$. It follows from (24) that the integral over the semicircle in (26)
tends to zero as $R \to \infty$, while it follows from (24) and (25) that the integral over
the segment tends to

    $\int (h + ik)^2\, d\xi = 0$.    (26')

Taking the real part of (26') gives

    $\int (h^2 - k^2)\, d\xi = 0$,

as asserted in theorem 7. □


Exercise 1. Show that

    $H^2 = -I$, where I = identity.    (27)

(Hint: Consider the relation of the real part of $-i f(\xi)$ to its imaginary part.)


Note that the kernel of H,

    $K(s, t) = \frac{1}{\pi}\, \frac{1}{s - t}$,    (28)

fails miserably the tests for boundedness given in theorems 2 and 3.


The argument used above to show that H is an isometry of $L^2 \to L^2$ can be used
to prove:


Theorem 8. The Hilbert transform H is a bounded map of $L^p \to L^p$ for all p,
$1 < p < \infty$.


Proof. Take p = 4, and consider the analytic function $f^4$. By Cauchy's theorem,

    $\oint f^4\, d\zeta = 0$.    (29)

We choose the same contour as in (26) and let $\epsilon \to 0$, $R \to \infty$, to obtain

    $\int (h + ik)^4\, d\xi = 0$.

The real part of this relation is

    $\int (h^4 - 6h^2k^2 + k^4)\, d\xi = 0$.    (29')

According to a well-known inequality, for a, b, c positive, $ab \le ca^2/2 + b^2/2c$;
applying this to $a = h^2$, $b = k^2$, c = 6 gives

    $6h^2k^2 \le 18h^4 + \tfrac{1}{2}k^4$.

Setting this into (29'), we get

    $\tfrac{1}{2} \int k^4\, d\xi \le 17 \int h^4\, d\xi$.

This shows that $H: L^4 \to L^4$ is bounded, and that $|H| \le 34^{1/4}$.
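
The identities used for p = 2 and p = 4 can be checked numerically on one explicit pair. The example is an assumption chosen for illustration: for $h(t) = 1/(1 + t^2)$ the function $f(\zeta) = i/(\zeta + i)$ is analytic in the upper half-plane, is $O(|\zeta|^{-1})$, and has boundary values $h + ik$ with $k(t) = t/(1 + t^2)$, so k = Hh. The quadrature routine `integral_over_R` is a hypothetical helper using the substitution t = tan(theta).

```python
import math

def integral_over_R(g, n=20000):
    """Midpoint rule for the integral of g over R after substituting t = tan(theta)."""
    width = math.pi / n
    total = 0.0
    for i in range(n):
        theta = -math.pi / 2 + (i + 0.5) * width
        # dt = d(theta) / cos(theta)^2
        total += g(math.tan(theta)) / math.cos(theta) ** 2 * width
    return total

def h(t):
    return 1.0 / (1.0 + t * t)

def k(t):
    return t / (1.0 + t * t)      # k = Hh for this particular h

h2 = integral_over_R(lambda t: h(t) ** 2)
k2 = integral_over_R(lambda t: k(t) ** 2)
h4 = integral_over_R(lambda t: h(t) ** 4)
k4 = integral_over_R(lambda t: k(t) ** 4)
h2k2 = integral_over_R(lambda t: h(t) ** 2 * k(t) ** 2)

print(h2, k2)                     # equal: the isometry of theorem 7
print(h4 - 6 * h2k2 + k4)         # approximately 0: relation (29')
print(k4 / h4)                    # far below the bound 34 derived from (29')
```

For this pair the ratio of the fourth-power integrals is 1/5, which shows that the constant 34 obtained from the elementary inequality is a bound and makes no claim to be the norm.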


The same argument works for p any even integer. Then, using M. Riesz's convexity
theorem, theorem 4, we deduce the boundedness of H as a map of $L^p \to L^p$ for
any p, $2 \le p < \infty$.

To complete the proof, we turn to theorem 5 of chapter 15, according to which the
transpose H' of H has the same norm as H. According to formulas (6), (6'), the kernel
of H' is obtained from the kernel of H by interchanging the roles of the variables.
According to (28), interchanging the variables in the kernel of H merely changes the
sign of the kernel. So

    $H' = -H$.    (30)

If H maps $L^p \to L^p$, H' maps $(L^p)' \to (L^p)'$. According to theorem 11 of
chapter 8, the dual of $L^p$ is $L^{p'}$, where

    $\frac{1}{p} + \frac{1}{p'} = 1$.    (31)

Note that if p > 2, p' < 2. Combining (30) and (31) with theorem 5 of chapter 15, we
conclude that the norm of $H: L^{p'} \to L^{p'}$ equals the norm of $H: L^p \to L^p$. Since
the latter were shown to be finite for $2 \le p < \infty$, it follows that they are bounded
for $1 < p' \le 2$ as well. □


Theorem 8, and the astonishing proof above, are due to M. Riesz. 


Exercise 2. Show that H is not bounded as a map $L^\infty \to L^\infty$. Deduce from this
that H is not bounded as a map $L^1 \to L^1$.


16.3.3 The Laplace Transform 


Let f(t) be a complex-valued function on $\mathbb{R}_+$: t > 0. Its Laplace transform Lf is
the function on $\mathbb{R}_+$: s > 0 defined by

    $(Lf)(s) = \int_0^\infty f(t)\, e^{-st}\, dt$.    (32)


Theorem 9. The Laplace transform L is a bounded map of $L^2(\mathbb{R}_+) \to L^2(\mathbb{R}_+)$,
and

    $\|L\| = \sqrt{\pi}$.    (33)

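Proof. We estimate g(s) = (Lf)(s) by the Schwarz inequality:

    $|g(s)|^2 = \left| \int_0^\infty f(t)\, e^{-st}\, dt \right|^2 = \left| \int_0^\infty f(t)\, e^{-st/2} t^{1/4} \cdot e^{-st/2} t^{-1/4}\, dt \right|^2$
    $\qquad \le \int_0^\infty |f(t)|^2 e^{-st} t^{1/2}\, dt \int_0^\infty e^{-st} t^{-1/2}\, dt$.    (34)

By a change of variable we can write the second integral as

    $\int_0^\infty e^{-st} t^{-1/2}\, dt = \int_0^\infty e^{-r} r^{-1/2}\, dr\; s^{-1/2} = C s^{-1/2}$,    (35)

    $C = \int_0^\infty e^{-r} r^{-1/2}\, dr = \int_0^\infty e^{-x^2} x^{-1}\, 2x\, dx = 2 \int_0^\infty e^{-x^2}\, dx = \sqrt{\pi}$.    (36)

Setting (35) into (34) gives

    $|g(s)|^2 \le C s^{-1/2} \int_0^\infty |f(t)|^2 e^{-st} t^{1/2}\, dt$.    (37)

Integrating (37) gives

    $\|g\|^2 = \int_0^\infty |g(s)|^2\, ds \le C \int_0^\infty\!\!\int_0^\infty |f(t)|^2 e^{-st} t^{1/2} s^{-1/2}\, dt\, ds$.    (38)

Interchange the order of integration, and change variables in the s-integral:

    $\int_0^\infty e^{-st} t^{1/2} s^{-1/2}\, ds = \int_0^\infty e^{-u} u^{-1/2}\, du = C$.

So we get from (38) that

    $\|g\|^2 \le C^2 \|f\|^2$.

Using the value of C given by (36), we conclude that $\|L\| \le \sqrt{\pi}$. To show that
equality holds, take $f(t) = 1/\sqrt{t}$ for a < t < b, zero outside this interval; for this
choice $\|f\|^2 = \log b/a$. Set g = Lf; it is not hard to show that as a tends to zero
and b to $\infty$, $\|g\|^2 \ge \pi (1 - \epsilon) \log b/a$. Combined with $\|L\| \le \sqrt{\pi}$ this proves (33).
□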
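
The extremal family in the proof of theorem 9 can be explored numerically. The sketch rests on one calculation that is not in the text and should be read as an assumption: for $f(t) = t^{-1/2}$ on (a, b), the substitution $t = u^2/s$ gives $(Lf)(s) = \sqrt{\pi/s}\,[\mathrm{erf}(\sqrt{sb}) - \mathrm{erf}(\sqrt{sa})]$. Taking a = 1/b and integrating in the variable x = log s, the block returns the ratio $\|Lf\|^2 / \|f\|^2$. The function name `ratio_squared` is hypothetical.

```python
import math

# The constant C of (36) is Gamma(1/2) = sqrt(pi).
C = math.gamma(0.5)

def ratio_squared(b, step=0.01, half_width=60.0):
    """||Lf||^2 / ||f||^2 for f(t) = t^(-1/2) on (1/b, b), zero elsewhere."""
    a = 1.0 / b
    total = 0.0
    n = int(2 * half_width / step)
    for i in range(n):
        x = -half_width + (i + 0.5) * step      # s = e^x, so ds/s = dx
        s = math.exp(x)
        d = math.erf(math.sqrt(s * b)) - math.erf(math.sqrt(s * a))
        total += d * d * step                    # midpoint rule for the x-integral
    # ||Lf||^2 = pi * (x-integral), and ||f||^2 = log(b/a).
    return math.pi * total / math.log(b / a)

for b in (1e2, 1e6, 1e12):
    print(b, ratio_squared(b), math.pi)
```

The ratios stay below $\pi$ and creep toward it only logarithmically in b/a, consistent with the factor $(1 - \epsilon)$ in the proof.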


Again note that the kernel of L, $e^{-st}$, utterly fails the criteria for $L^2$ boundedness
contained in either theorem 2 or theorem 3.


Exercise 3. Prove that the Laplace transform L is not bounded as a map of
$L^p(\mathbb{R}_+) \to L^p(\mathbb{R}_+)$, except for p = 2. (Hint: Try $f(t) = e^{-at}$.)


As remarked in chapter 15, theorem 8, if L is bounded, so is $L^2$; therefore it
follows by submultiplicativity from (33) that

    $\|L^2\| \le \|L\|^2 = \pi$.    (39)

We claim that in (39) the sign of equality holds. To see this, we note that the kernel
of the integral operator L, $e^{-st}$, is a real symmetric function of s and t. It is easily
verified (see formulas (6), (6')) that an integral operator L with a symmetric kernel
satisfies

    $(Lu, v) = (u, Lv)$.    (40)

That is, such an operator is its own adjoint; such operators are called symmetric.


Theorem 10. Let L be a bounded, symmetric mapping of a real Hilbert space H into
itself. Then

    $\|L^2\| = \|L\|^2$.


Proof. By submultiplicativity, $\|L^2\| \le \|L\|^2$ is valid for all mappings, symmetric or
not. To show the opposite inequality, we set v = Lu into (40); we get

    $(Lu, Lu) = (u, L^2 u)$.

The left side equals $\|Lu\|^2$; estimating the right side by the Schwarz inequality gives

    $\|Lu\|^2 \le \|u\|\, \|L^2 u\| \le \|u\|^2 \|L^2\|$.

Since this holds for all vectors u in H, $\|L\|^2 \le \|L^2\|$ follows. □
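
A two-dimensional sketch shows both the identity of theorem 10 and why symmetry is needed. The setting is an assumption for illustration: the Hilbert space is R^2, and for a symmetric 2 x 2 matrix the norm is the largest absolute eigenvalue, which has a closed form. The helper names are hypothetical.

```python
import math

def matmul2(a, b):
    """Product of two 2 x 2 matrices."""
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
            for i in range(2)]

def sym_norm(a):
    """Norm of a symmetric 2 x 2 matrix [[p, q], [q, r]]: the largest |eigenvalue|."""
    p, q, r = a[0][0], a[0][1], a[1][1]
    return abs(p + r) / 2 + math.hypot((p - r) / 2, q)

L = [[1.0, 2.0], [2.0, -3.0]]           # symmetric
L_squared = matmul2(L, L)
print(sym_norm(L) ** 2, sym_norm(L_squared))   # equal, as theorem 10 asserts

# Without symmetry the identity can fail: N maps (0, 1) to (1, 0), so ||N|| >= 1,
# yet N*N is the zero matrix.
N = [[0.0, 1.0], [0.0, 0.0]]
print(matmul2(N, N))
```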


Clearly, it follows from theorem 10 that the sign of equality holds in relation (39).
The mapping $L^2$ is easily computed:

    $(L^2 f)(r) = \int_0^\infty (Lf)(s)\, e^{-sr}\, ds = \int_0^\infty\!\!\int_0^\infty f(t)\, e^{-st}\, dt\; e^{-sr}\, ds$
    $\qquad = \int_0^\infty f(t) \int_0^\infty e^{-(t+r)s}\, ds\, dt = \int_0^\infty \frac{f(t)}{t + r}\, dt$.

So we have proved


Theorem 11. The integral operator $f \to g$:

    $g(r) = \int_0^\infty \frac{f(t)}{t + r}\, dt$    (41)

is bounded as a map of $L^2(\mathbb{R}_+) \to L^2(\mathbb{R}_+)$, and its norm is $\pi$.


The map (41) is called the Hilbert-Hankel operator. Note that its kernel, $1/(t + r)$,
utterly fails the test for $L^2$ boundedness contained in either theorem 2 or 3.


Exercise 4. Prove that the Hilbert-Hankel operator is a bounded map of $L^p \to L^p$
for $1 < p < \infty$.


For further information about integral operators, see Halmos and Sunder. 


16.4 SOLUTION OPERATORS FOR HYPERBOLIC EQUATIONS

We recall from section 5 of chapter 11 the class of symmetric hyperbolic operators
of first order. These are first-order partial differential operators of the form

    $L = \sum_j A_j \partial_j + B$, $\partial_j = \frac{\partial}{\partial s_j}$.    (42)

The $A_j$ and B are n x n matrices with real-valued entries that are reasonably smooth
functions of s. We take them to be periodic in s. L acts on vector-valued functions
u(s), whose components are real-valued, and assumed reasonably smooth, periodic
functions of s. As scalar product for such functions we take the $L^2$ scalar product
over a period parallelogram F:


    $(u, v) = \int_F u \cdot v\, ds$,    (43)

where the dot is the standard inner product for vectors. We assume the coefficient
matrices $A_j$ to be symmetric:

    $A_j^T = A_j$.    (44)

In this case the formal adjoint L* of L takes the form

    $L^* = -L + K$,    (44')

where

    $K = B + B^T - \sum_j A_{j,j}$, $A_{j,j} = \partial_j A_j$.    (45)

Adjointness means that for smooth functions u and v,

    $(v, Lu) = (L^* v, u)$;

from this and (44') we deduce, upon setting v = u, that

    $2(u, Lu) = (u, Ku)$,    (45')

see equation (20'), chapter 11.


Theorem 12. Let u(s, t) be a solution of

    $u_t + Lu = 0$,    (46)

and assume that u is periodic in s. Then

    $\|u(T)\| \le c \|u(0)\|$,    (47)

with c a constant that may depend on T. Here the norm is the $L^2$ norm (43) over a
period parallelogram F.


REMARK 1. It follows from (47) that the solution u is uniquely determined by its
initial value u(s, 0). Thus u(T) is related to u(0), and since equation (46) is linear,
this relation is clearly linear; denote by S(T) the map relating u(0) to u(T):

    $S(T): u(0) \to u(T)$;    (48)

S(T) is called the solution operator. Theorem 12 states that for each T the solution
operator is bounded in the norm $L^2(F) \to L^2(F)$.


Proof. Assume first that the matrix K in (45) is $\ge 0$ for all s. Take the scalar
product of (46) with 2u, and integrate over F. Using (43), we can write

    $2(u, u_t) + 2(u, Lu) = 0$,

so by (45')

    $2(u, u_t) + (u, Ku) = 0$.    (49)

The first term can be written as d(u, u)/dt. Therefore, if the symmetric matrix $K \ge 0$,
it follows from (49) that $\|u(t)\|$ is decreasing as a function of t; from this (47)
follows for all T > 0, c = 1.

If K is not positive, introduce v by $u = e^{kt} v$ as new dependent variable; set this
into (46) to obtain $v_t + (k + L)v = 0$. For k large enough, $k + K \ge 0$, so v satisfies
(47) with c = 1. Therefore u satisfies (47) with $c = e^{kT}$, T > 0. □


Obviously the proof works also if u is not periodic in s but tends to zero as $|s| \to \infty$
so fast that $u \in L^2(\mathbb{R}^n)$.
We take now a special example of an equation of form (42), with s = (x, y):

    $A_1 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, $A_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $B = 0$.

Denoting u = (v, w)', we can write (46) componentwise as

    $v_t + v_x + w_y = 0$,
    $w_t - w_x + v_y = 0$.

A brief calculation shows that v and w each satisfy
the classical wave equation. There is an explicit solution formula for the wave equation
that puts the solution operator S(T) in the form of an integral operator. The
kernel of S(T) in this case is not even a function but a distribution; it utterly fails
to satisfy the $L^2$ boundedness criteria stated in theorems 2 and 3.
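
For this constant-coefficient example the boundedness of S(T) can be made concrete on a single Fourier mode. The reduction is an assumption introduced here for illustration: for $u = \hat{u}(t)\, e^{i(\xi x + \eta y)}$ the system becomes $\hat{u}_t + iA\hat{u} = 0$ with $A = \xi A_1 + \eta A_2$, and since $A^2 = (\xi^2 + \eta^2) I$ the exponential $e^{-itA}$ has the closed form used below. The names `mode_operator`, `apply`, and `norm` are hypothetical.

```python
import math

def mode_operator(xi, eta, t):
    """S(t) acting on the amplitude of the mode exp(i(xi*x + eta*y)).

    With A = xi*A1 + eta*A2 = [[xi, eta], [eta, -xi]] one has A*A = (xi^2 + eta^2) I,
    so exp(-i t A) = cos(w t) I - i (sin(w t) / w) A with w = sqrt(xi^2 + eta^2).
    """
    w = math.hypot(xi, eta)
    c = math.cos(w * t)
    s = math.sin(w * t) / w if w > 0 else t
    return [[c - 1j * s * xi, -1j * s * eta],
            [-1j * s * eta, c + 1j * s * xi]]

def apply(S, u):
    return [S[0][0] * u[0] + S[0][1] * u[1],
            S[1][0] * u[0] + S[1][1] * u[1]]

def norm(u):
    return math.sqrt(sum(abs(c) ** 2 for c in u))

u0 = [1 + 2j, -0.5j]
uT = apply(mode_operator(3.0, -4.0, 0.7), u0)
print(norm(u0), norm(uT))    # equal up to rounding: (47) holds with c = 1
```

Here the coefficient matrices are constant and B = 0, so K = 0 in (45) and the energy identity (49) gives conservation of the norm, not merely a bound.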


16.5 SOLUTION OPERATOR FOR THE HEAT EQUATION

We consider solutions u(x, t) of the heat equation

    $u_t = u_{xx}$    (50)

that are defined for all x and all $t \ge 0$, and which $\to 0$ sufficiently rapidly as
$|x| \to \infty$.

Theorem 13. Let u(x, t) be a solution of the heat equation, as above. Then for all
T > 0

(i) $|u(T)|_{\max} \le |u(0)|_{\max}$.
(ii) $|u(T)|_{L^1} \le |u(0)|_{L^1}$.
(iii) $|u(T)|_{L^2} \le |u(0)|_{L^2}$.


REMARK 2. Since (50) is linear, these estimates show that u is uniquely determined
by its initial value and that the dependence of u on its initial data is linear. Therefore
the solution operator

    $S(t): u(0) \to u(t)$    (51)

is well defined. In terms of it, theorem 13 can be formulated so: $|S(t)| \le 1$ as an
operator mapping $L^p \to L^p$, $p = \infty, 1, 2$.
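
A discrete analogue makes the three contraction properties visible. The scheme below is a swapped-in illustration, not the argument of the text: it is the standard explicit finite-difference step on a periodic grid, under the assumption that the ratio lam = dt/dx^2 is at most 1/2, in which case each new value is a convex combination of old ones. The names `heat_step` and `norms` are hypothetical.

```python
import math

def heat_step(u, lam):
    """One explicit step u_j <- u_j + lam*(u_{j+1} - 2u_j + u_{j-1}) on a periodic grid."""
    n = len(u)
    return [u[j] + lam * (u[(j + 1) % n] - 2 * u[j] + u[j - 1]) for j in range(n)]

def norms(u):
    return (max(abs(v) for v in u),              # max norm
            sum(abs(v) for v in u),              # discrete L^1 norm (grid spacing omitted)
            math.sqrt(sum(v * v for v in u)))    # discrete L^2 norm

lam = 0.4                                        # assumption: lam = dt/dx^2 <= 1/2
u = [math.sin(0.3 * j) * math.exp(-((j - 32) / 8.0) ** 2) for j in range(64)]
history = [norms(u)]
for _ in range(50):
    u = heat_step(u, lam)
    history.append(norms(u))

print(history[0])
print(history[-1])
```

The initial data change sign on purpose: for data of one sign the discrete $L^1$ norm is conserved exactly, so the decrease in (ii) comes from cancellation between positive and negative parts.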


Proof. Let k be an arbitrary positive number. Define v(x, t) to be

    $v = u e^{-kt}$.    (52)

Then v satisfies the equation

    $v_t + kv = v_{xx}$.    (50')

Since u(x, t) was assumed to $\to 0$ as $|x| \to \infty$, the same is true of v(x, t). It
follows that in the strip $0 \le t \le T$, $-\infty < x < \infty$ the function |v(x, t)| takes on
its maximum. We claim that this is at a point where t = 0. Say that the maximum
occurs at a point where t > 0. If v(x, t) > 0 at this point, then the first term on the left in
(50') is $\ge 0$, and the second term on the left is > 0, while the term $v_{xx}$ on the right is
$\le 0$. At a negative minimum we find an analogous contradiction. So it follows that

    $\max_{0 \le t \le T,\, x} |v(x, t)| = \max_x |v(x, 0)|$.

This shows that v satisfies property (i) of theorem 13. Letting $k \to 0$ in the definition
(52) of v shows that also u satisfies (i):

    $|S(t)| \le 1$, $S: L^\infty \to L^\infty$.




(ii) Consider the space of all solutions w(x, t) of the backward heat equation

    w_t = -w_{xx},                                                  (53)

defined for 0 ≤ t ≤ T, and all x, which tend to zero sufficiently rapidly as |x| → ∞.
Multiply equation (50) by w, (53) by u, and add; the result can be written as

    (uw)_t = w u_{xx} - u w_{xx}.

Integrate this with respect to x on R; integrate by parts. The fact that u, w tend to
zero as |x| → ∞ shows that the integral of the right is zero. So we get

    0 = \int (uw)_t\, dx = \frac{d}{dt} \int uw\, dx;

that is, ∫ uw dx = (u(t), w(t)) is independent of t; in particular,

    (u(0), w(0)) = (u(T), w(T)).                                    (54)

Denote the initial value u(0) by f; in the notation (51), u(T) = S(T)f. Similarly
denote the final value w(T) by g. Analogously to what we have shown about solutions
of (50), for t < T, w(t) is completely determined by w(T), and there is a linear
relation between w(T) and w(0) that we denote by S':

    w(0) = S'(T)g.

We rewrite (54) in this new notation as

    (f, S'(T)g) = (S(T)f, g).                                       (55)

The bracket (u, w) is a bilinear function: for fixed w, it is a linear functional of u and
for fixed u, a linear functional of w. Thus (55) says that S and S' are transposes of
each other with respect to this bilinear pairing.

It is easy to verify that

    |u|_{L^1} = \sup_{|w|_{\max} = 1} (u, w).

According to part (i), |S'(T)g|_max ≤ |g|_max, so we deduce from (55) that
|S(T)f|_{L^1} ≤ |f|_{L^1}, as asserted in part (ii).

Part (iii), the boundedness of S : L^2 → L^2, follows from the Marcel Riesz
convexity theorem, theorem 4 above. □


REMARK 3. Here is another, direct proof for part (iii). Multiply equation (50) by 2u
and integrate with respect to x over R. Integrate by parts on the right. By the fact that
u(x, t) → 0 as |x| → ∞, we get

    \frac{d}{dt} \int u^2\, dx = -2 \int u_x^2\, dx.




This shows that ∫ u^2(x, t) dx is a decreasing function of t, from which part (iii)
follows.

A similar direct proof can be given for part (ii). Let x_j(t) be the points where
u(x, t) changes sign:

    u(x, t) > 0 \text{ for } x_j < x < x_{j+1},\ j \text{ even}; \qquad
    u(x, t) < 0 \text{ for } x_j < x < x_{j+1},\ j \text{ odd}.       (56)

Then

    |u(t)|_{L^1} = \sum_j (-1)^j \int_{x_j}^{x_{j+1}} u(x, t)\, dx.   (57)

Differentiate this with respect to t. Using calculus and equation (50), we get

    \frac{d}{dt} |u(t)|_{L^1} = \sum_j (-1)^j \int_{x_j}^{x_{j+1}} u_t\, dx
                              = \sum_j (-1)^j \int_{x_j}^{x_{j+1}} u_{xx}\, dx
                              = \sum_j (-1)^j \big( u_x(x_{j+1}) - u_x(x_j) \big).   (57')

It follows from (56) that the first x-derivative of u alternates in sign at the points x_j:

    u_x(x_j) \ge 0 \text{ for } j \text{ even}, \qquad u_x(x_j) \le 0 \text{ for } j \text{ odd};

therefore the right side of (57') is ≤ 0. This shows that |u(t)|_{L^1} is a decreasing
function of t, as asserted in part (ii). □


We give now yet another proof of theorem 13; the initial value problem for (50)
can be solved explicitly:

    u(x, t) = \frac{1}{2\sqrt{\pi t}} \int f(y)\, e^{-(x-y)^2/4t}\, dy.

This shows that S is an integral operator whose kernel is exp{−(x − y)^2/4t}/2√(πt).
We appeal to parts (iii) and (iv) of theorem 1 to prove parts (i) and (ii) of theorem 13,
and to theorem 3 to prove part (iii). □
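
Independently of the precise form of the criteria cited, the fact about the heat kernel that makes them applicable with constant 1 is that it is positive and has total integral 1 in y for every x and every t > 0. The sketch below checks this by the trapezoid rule; the function names, the truncation width, and the sample values of x and t are assumptions made for illustration.

```python
import math

def heat_kernel(x, y, t):
    """Kernel exp(-(x - y)^2 / (4t)) / (2 sqrt(pi t)) of the solution operator S(t)."""
    return math.exp(-(x - y) ** 2 / (4.0 * t)) / (2.0 * math.sqrt(math.pi * t))

def kernel_mass(x, t, half_width=40.0, n=8000):
    """Trapezoid-rule approximation of the integral of the kernel over y,
    truncated to the interval [x - half_width, x + half_width]."""
    a, b = x - half_width, x + half_width
    h = (b - a) / n
    total = 0.5 * (heat_kernel(x, a, t) + heat_kernel(x, b, t))
    for i in range(1, n):
        total += heat_kernel(x, a + i * h, t)
    return total * h

for t in (0.01, 0.5, 2.0):
    print(t, kernel_mass(0.3, t))
```

By the symmetry of the kernel in x and y, the same computation gives the integral with respect to x for fixed y.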


Theorem 13 holds for second-order parabolic equations in any number of space 
variables; the proofs sketched above apply to the general case, except of course the 
last one based on the explicit formula for the solution. 


16.6 SINGULAR INTEGRAL OPERATORS, PSEUDODIFFERENTIAL 
OPERATORS AND FOURIER INTEGRAL OPERATORS 


The above-named classes of operators play a dominant role in modern analysis, in 
particular, the modern theory of partial differential equations. They extend enor- 
mously the class of traditional integral operators; they unify integral and differential 




operators. In particular, inverses of various differential operators can be expressed in 
terms of such operators. 

For the theory of these operators we refer to the literature, in particular, to
Hörmander and Taylor. We call attention to a particularly sharp result concerning the
L^2 boundedness of pseudodifferential operators due to David and Journé.


BIBLIOGRAPHY 


David, G. and Journé, J.-L. A boundedness criterion for generalized Calderón-Zygmund operators. An.
Math., 120 (1984): 371-397.

Halmos, P. R. and Sunder, V. S. Bounded Integral Operators on L^p Spaces. Ergebnisse der Math. und
ihrer Grenzgebiete, XV. Springer, Berlin, 1978.

Hardy, G. H., Littlewood, J. E., and Pólya, G. Inequalities. Cambridge University Press, Cambridge, 1934.

Hausdorff, F. Eine Ausdehnung des Parsevalschen Satzes über Fourierreihen. Math. Zeitschrift, 16 (1923):
163-169.

Hörmander, L. The Analysis of Linear Partial Differential Operators. Springer Verlag, 1983.

Riesz, M. Sur les maxima des formes bilinéaires et sur les fonctionelles linéaires. Acta Math., 49 (1926):
465-497.

Riesz, M. Sur les fonctions conjuguées. Math. Zeit., 27 (1927): 218-244.

Schur, I. Bemerkungen zur Theorie der beschränkten Bilinearformen mit unendlich vielen
Veränderlichen. J. für Math., 140 (1911): 1-28.

Taylor, M. E. Pseudodifferential Operators. Princeton University Press, Princeton, NJ, 1981.

Thorin, G. O. Convexity theorems generalizing those of M. Riesz and Hadamard with some applications.
Seminar Math. Lund, 9 (1948).

Young, W. H. On the determination of the summability of a function by means of its Fourier coefficients.
Proc. London Math. Soc., 12 (1913): 71-88.

Weyl, H. Singuläre Integralgleichungen mit besonderer Berücksichtigung des Fourierschen
Integraltheorems. Collected Works. Springer, Berlin.


17

BANACH ALGEBRAS AND
THEIR ELEMENTARY SPECTRAL THEORY


17.1 NORMED ALGEBRAS 


We saw in chapter 15 how to multiply, by composition, two linear maps of Banach 
spaces into other Banach spaces, provided that the target space of the first map is 
the domain space of the second map. In this chapter we specialize to the study of 
bounded, linear maps of a Banach space X into itself. Any two such maps may 
be composed, so that the set ℒ(X, X) of such maps forms an algebra, with a unit.
Each element of ℒ(X, X) has a norm, with the properties embodied in theorem 2
and theorem 8; that is to say, the norm is subadditive and submultiplicative. Such an
algebra is called a normed algebra.

There is a group of important results about bounded linear maps of a Banach space
X into itself that depend only on the algebraic and analytic structure of ℒ(X, X).
In this chapter and in chapter 18, we derive these results in the context of normed 
algebras. 


Definition. A normed algebra is an associative algebra over the complex numbers.
Each element has a positive norm, |M| = 0 only for M = 0. The norm satisfies the
usual conditions:

    |M + N| \le |M| + |N|, \quad |cM| = |c||M|, \quad \text{and} \quad |NM| \le |N||M|.   (1)

A normed algebra with a unit is just that, where the norm of the unit is 1:

    |I| = 1.                                                        (2)


Definition. A normed algebra L that is complete with respect to the norm is called 
a Banach algebra. 




The theorems contained in this section are valid for all Banach algebras ℒ with a
unit, not just for ℒ(X, X).


Definition. An element M of a Banach algebra ℒ with a unit is called invertible if it
has an inverse N = M^{-1} in ℒ:

    NM = MN = I.                                                    (3)

M is said to have a left inverse A, respectively a right inverse B, if

    AM = I, \quad \text{respectively} \quad MB = I.                 (4)

It is an elementary fact of algebra that if M has both a left inverse A and a right
inverse B, then these are equal. For multiply the first relation in (4) by B on the right:

    AMB = B.                                                        (5)

Using associativity and the second relation (4) we get A = B.
Theorem 1.
(i) If M and K in ℒ are invertible, so is their product MK, and

    (MK)^{-1} = K^{-1} M^{-1}.                                      (6)

(ii) If M and K commute,

    MK = KM,                                                        (7)

and if their product is invertible, so are M and K separately.


Proof. (i) is an obvious consequence of associativity. To show (ii), denote the
inverse of MK by N:

    (MK)N = I = N(MK).

By associativity, we conclude that KN is a right inverse of M. By commutativity
of M and K and associativity, we get

    I = N(MK) = N(KM) = (NK)M,

from which we conclude that NK is a left inverse of M. So M is invertible; the
same argument with the roles of M and K interchanged shows that K is invertible. □


Theorem 1 is purely algebraic; not so 


Theorem 2. Suppose that K in ℒ is invertible; then so are all elements of ℒ close
enough to K. Specifically all elements of form L = K − A are invertible, provided
that

    |A| < \frac{1}{|K^{-1}|}.                                       (8)




Proof. We treat first the special case K = I; we claim that all elements of the form
I − B are invertible, provided that

    |B| < 1.                                                        (9)

The inverse of I − B is given by the geometric series

    S = \sum_{n=0}^{\infty} B^n.                                    (9')

Clearly, since |B^n| ≤ |B|^n and |B| < 1, the sequence of partial sums is a Cauchy
sequence; since ℒ is complete, the series converges. A convergent series can be
multiplied termwise; multiplying (9') on the left by B, we get

    BS = B \sum_{n=0}^{\infty} B^n = \sum_{n=1}^{\infty} B^n = S - I,

from which it follows that (I − B)S = I. Similarly, multiplying (9') by B on the right
shows that S(I − B) = I. This shows that S is the inverse of I − B.
We return now to (8); we factor

    K - A = K(I - K^{-1}A).                                         (10)

Set B = K^{-1}A; by submultiplicativity, and by inequality (8),

    |B| = |K^{-1}A| \le |K^{-1}||A| < 1.

Using (9'), we invert (10):

    (K - A)^{-1} = (I - K^{-1}A)^{-1} K^{-1} = \sum_{n=0}^{\infty} (K^{-1}A)^n K^{-1}.   (10')

This proves that (K − A) is invertible. □
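
The geometric series argument is easy to try out in the Banach algebra of 2 × 2 real matrices. The sketch below is an illustration under assumptions that are not in the text: the norm is taken to be the maximum absolute row sum (which is submultiplicative and gives the unit norm 1), the matrix B is an arbitrary choice with |B| < 1, and all function names are hypothetical.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def norm(a):
    """Maximum absolute row sum: a submultiplicative norm with |I| = 1."""
    return max(sum(abs(x) for x in row) for row in a)

def neumann_inverse(b, terms=60):
    """Partial sum I + B + ... + B^terms of the geometric series for (I - B)^{-1}."""
    s = identity(len(b))
    power = identity(len(b))
    for _ in range(terms):
        power = mat_mul(power, b)
        s = mat_add(s, power)
    return s

B = [[0.2, 0.3], [-0.1, 0.4]]      # illustrative choice with norm < 1
S = neumann_inverse(B)
residual = mat_sub(mat_mul(mat_sub(identity(2), B), S), identity(2))
print("norm of B:", norm(B))
print("norm of (I - B) S - I:", norm(residual))
```

The partial sums also satisfy the bound |S| ≤ 1/(1 − |B|), which follows from the triangle inequality applied to the series.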


Definition. The resolvent set of M in ℒ consists of those complex numbers λ for
which

    \lambda I - M

is invertible; the spectrum of M consists of those λ for which it is not. The resolvent
set of M is denoted by ρ(M), its spectrum by σ(M).


In chapter 11, section 11.4, we defined the notion of an analytic function of a
complex variable whose values lie in a Banach space over C. Since a Banach algebra
is, in particular, a Banach space over C, we may speak of analytic functions whose
values lie in a Banach algebra. As the reader may immediately verify, the product of
two such analytic functions is analytic. All the standard paraphernalia of the theory
of analytic functions (the Cauchy integral theorem, the Cauchy integral formula,
power series, Laurent series, and so on) are meaningful and valid for functions that
take their values in a Banach algebra.

Theorem 3.

(i) The resolvent set ρ(M) is an open subset of C.

(ii) The resolvent of M, defined on ρ(M) as (ζI − M)^{-1}, abbreviated as
(ζ − M)^{-1}, is an analytic function of ζ on ρ(M).

Proof. Suppose that λ is in ρ(M); then by theorem 2 applied to K = λI − M and
A = hI,

    (\lambda - h)I - M = (\lambda I - M) - hI

is invertible for h small enough. This proves part (i).
By formula (10'),

    ((\lambda - h) - M)^{-1} = \sum_{n=0}^{\infty} (\lambda - M)^{-n-1} h^n;   (11)

this shows that the resolvent can be expanded in a power series around each point λ
of ρ(M), convergent for |h| < |(λ − M)^{-1}|^{-1}; this proves analyticity, as asserted in
part (ii). □

The series (11) converges when |h| is less than |(λ − M)^{-1}|^{-1}. From this we
deduce

Corollary 3'. For any λ in ρ(M), denote by d(λ) the distance of λ to the spectrum
of M. Then

    |(\lambda - M)^{-1}| \ge \frac{1}{d(\lambda)}.

Theorem 4 (Gelfand).

(i) The spectrum σ(M) is a closed, bounded, nonempty set in C.

(ii) The spectral radius of M, denoted as |σ(M)|, is defined as

    |\sigma(M)| = \max_{\lambda \in \sigma(M)} |\lambda|.           (12)

We claim that

    |\sigma(M)| = \lim_{k \to \infty} |M^k|^{1/k}.                   (12')




Proof. Since ρ(M) is open, its complement, σ(M), is closed. Applying (9), (9')
to B = ζ^{-1}M, we see that

    (\zeta I - M)^{-1} = \zeta^{-1} (I - M\zeta^{-1})^{-1} = \sum_{n=0}^{\infty} M^n \zeta^{-n-1}   (13)

converges for |ζ^{-1}M| < 1, that is, for |ζ| > |M|. This proves that every such ζ
belongs to ρ(M); it follows that every λ in σ(M) satisfies |λ| ≤ |M|. This proves
that the spectrum is bounded.

Representation (13) is a Laurent series for the resolvent around ∞; the first term
is ζ^{-1}I. Integrating (13) with respect to ζ around the contour C: |ζ| = c, c > |M|
gives

    \frac{1}{2\pi i} \oint_C (\zeta - M)^{-1}\, d\zeta = I.          (14)

If the spectrum of M were empty, then (ζ − M)^{-1} would be, by part (ii) of theorem 3,
an everywhere regular analytic function. Then by the Cauchy integral theorem,
applicable to analytic functions in a Banach space, the integral on the left in (14) would
be zero; since the right side isn't, this proves that σ(M) is not empty, a result due to
A. E. Taylor.

We investigate now more precisely the radius of convergence of the series (13).
Let k be any integer; then we can decompose n = kq + r, 0 ≤ r < k. So M^n =
M^{kq+r} = (M^k)^q M^r, from which we deduce that

    |M^n| \le |M|^r |M^k|^q.

This gives the estimate

    |M^n \zeta^{-n-1}| \le |M|^r |\zeta|^{-r-1} \left( \frac{|M^k|}{|\zeta|^k} \right)^q.

Thus the series (13) converges absolutely if

    |\zeta|^k > |M^k|, \quad \text{that is,} \quad |\zeta| > |M^k|^{1/k}.   (15)

Thus every ζ satisfying (15) belongs to the resolvent set; it follows that every λ in
σ(M) satisfies |λ| ≤ |M^k|^{1/k}. In view of definition (12) it follows that |σ(M)| ≤
|M^k|^{1/k}; since this holds for all integers k,

    |\sigma(M)| \le \liminf_{k \to \infty} |M^k|^{1/k}.              (16)

We turn now again to representation (13) of the resolvent and express the coefficient
of ζ^{-n-1} by the Cauchy integral formula:

    \frac{1}{2\pi i} \oint_C (\zeta - M)^{-1} \zeta^n\, d\zeta = M^n.   (17)




As path of integration we may choose any contour C in the resolvent set of M that
winds once around σ(M). It follows from the definition (12) that |ζ| = |σ(M)| + δ
is such a contour. We may then estimate from (17):

    |M^n| \le c\, (|\sigma(M)| + \delta)^{n+1}, \qquad
    c = \max_{|\zeta| = |\sigma(M)| + \delta} |(\zeta I - M)^{-1}|.

Take the nth root,

    |M^n|^{1/n} \le c^{1/n} (|\sigma(M)| + \delta)^{1 + 1/n},

and then form the lim sup,

    \limsup_{n \to \infty} |M^n|^{1/n} \le |\sigma(M)| + \delta.

Since this is true for any δ > 0, it is true for δ = 0:

    \limsup_{n \to \infty} |M^n|^{1/n} \le |\sigma(M)|.              (18)

Comparing (16) and (18), we conclude that the lim inf and lim sup are equal, and we
obtain Gelfand’s formula (12’) for the spectral radius. □
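
Gelfand's formula can be watched at work on a matrix whose norm is much larger than its spectral radius. The sketch below is an illustration only: the matrix M, the choice of the maximum absolute row sum as norm, and the function names are assumptions. Both eigenvalues of the chosen M equal 0.5, so the spectral radius is 0.5, while |M| = 10.5.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def norm(a):
    """Maximum absolute row sum: a submultiplicative norm with |I| = 1."""
    return max(sum(abs(x) for x in row) for row in a)

def root_norms(m, n_max):
    """Return the list of |M^n|^(1/n) for n = 1, ..., n_max."""
    values = []
    power = [row[:] for row in m]
    for n in range(1, n_max + 1):
        values.append(norm(power) ** (1.0 / n))
        power = mat_mul(power, m)
    return values

M = [[0.5, 10.0], [0.0, 0.5]]     # illustrative; upper triangular, eigenvalues 0.5, 0.5
vals = root_norms(M, 200)
for n in (1, 10, 100, 200):
    print(n, vals[n - 1])
```

Every term of the sequence stays at or above the spectral radius, in line with inequality (16), and the terms approach 0.5 only slowly because of the large off-diagonal entry.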


17.2 FUNCTIONAL CALCULUS 


Since ℒ is an algebra, we can form any polynomial p of an element M of ℒ by setting

    p(M) = \sum_{j=0}^{N} a_j M^j.                                  (19)

(19) defines a mapping from the algebra of polynomials into the algebra ℒ that is,
clearly, a homomorphism. This homomorphism can be extended to a larger class of
functions than polynomials; for instance, we can define

    e^M = \sum_{n=0}^{\infty} \frac{M^n}{n!}.

More generally, we can define

    f(M) = \sum_{n=0}^{\infty} a_n M^n                              (20)

for any entire function

    f(\zeta) = \sum_{n=0}^{\infty} a_n \zeta^n.                     (20')

Still more generally it follows from (12’) that we can define (20) for any function
f(ζ) whose power series converges in a circle whose radius exceeds |σ(M)|. We
propose now a still further extension:




Definition. Let M be an element of ℒ, f(ζ) a function analytic in a domain G
containing σ(M). Let C be a contour in G ∩ ρ(M) that winds once around every
point in σ(M) but winds zero times around any point of the complement of G. We
define

    f(M) = \frac{1}{2\pi i} \oint_C (\zeta - M)^{-1} f(\zeta)\, d\zeta.   (21)

By the Cauchy integral theorem, (21) is independent of the choice of the contour.


Theorem 5.

(i) For f a polynomial, definitions (21) and (19) are the same.
(ii) The mapping (21) from the algebra of functions analytic on an open set
containing σ(M) into ℒ is a homomorphism.

(iii)

    \sigma(f(M)) = f(\sigma(M)).                                    (22)

(iv) Let f be analytic on an open set containing σ(M), and g analytic on an open
set containing f(σ(M)). Denote their composite by h,

    h(\zeta) = g(f(\zeta));                                         (23)

then

    h(M) = g(f(M)).                                                 (23')


Proof. (i) Replacing f(ζ) by a polynomial in (21) and using formula (17) shows
that (21) and (19) are the same. The same argument shows that (21) is the same as
(20) for f analytic in a disk of radius > |σ(M)|.

(ii) For any pair of complex numbers ζ and ω,

    (\zeta I - M) - (\omega I - M) = (\zeta - \omega) I.

Suppose that both ζ and ω belong to ρ(M). Multiply the above identity by
(ζ − M)^{-1} (ω − M)^{-1} (ζ − ω)^{-1}:

    (\zeta - \omega)^{-1} \left[ (\omega - M)^{-1} - (\zeta - M)^{-1} \right]
        = (\zeta - M)^{-1} (\omega - M)^{-1}.                       (24)

Relation (24) is called the resolvent identity.

The mapping f → f(M) given by (21) is obviously linear. We show now that
it is multiplicative. Let f, g both be functions analytic in an open set G ⊃ σ(M).
We choose two contours C and D, both in G ∩ ρ(M) so that they have no point in
common, and so that D lies inside C. That is, C winds once around every point ω
of D, while D winds zero times around every point ζ of C. Using definition (21) for
f and g with contours C and D, we write f(M)g(M) as a product of two integrals,
which we express as a double integral. Then we use the resolvent identity (24):

    f(M)g(M)
      = \frac{1}{(2\pi i)^2} \oint_C \oint_D (\zeta - M)^{-1} (\omega - M)^{-1} f(\zeta) g(\omega)\, d\omega\, d\zeta
      = \frac{1}{(2\pi i)^2} \oint_C \oint_D (\zeta - \omega)^{-1} \left[ (\omega - M)^{-1} - (\zeta - M)^{-1} \right] f(\zeta) g(\omega)\, d\omega\, d\zeta
      = \frac{1}{2\pi i} \oint_D \left[ \frac{1}{2\pi i} \oint_C (\zeta - \omega)^{-1} f(\zeta)\, d\zeta \right] (\omega - M)^{-1} g(\omega)\, d\omega
        - \frac{1}{2\pi i} \oint_C \left[ \frac{1}{2\pi i} \oint_D (\zeta - \omega)^{-1} g(\omega)\, d\omega \right] (\zeta - M)^{-1} f(\zeta)\, d\zeta.   (25)

Since C winds once around every point ω of D, the integral with respect to ζ above
in the first term is, by the Cauchy integral formula, f(ω). Since D does not wind
around any point ζ on C, the ω integration in the second term above is zero; so we
conclude from (25) that

    f(M)g(M) = \frac{1}{2\pi i} \oint_D (\omega - M)^{-1} f(\omega) g(\omega)\, d\omega,

which by (21) is h(M), where h(ω) = f(ω)g(ω). This proves that the mapping (21)
is multiplicative.

(iii) We have to show that μ belongs to the spectrum of f(M) if and only if μ is
of the form

    \mu = f(\lambda), \quad \lambda \text{ in } \sigma(M).          (26)

If μ is not of form (26), then f(ζ) − μ does not vanish on σ(M). Therefore
(f(ζ) − μ)^{-1} = g(ζ) is analytic in an open set containing σ(M), thus we may
define g(M) by formula (21). According to part (ii), [f(M) − μI] g(M) = h(M),
where h(ζ) = (f(ζ) − μ) g(ζ) ≡ 1. Thus h(M) = I, and g(M) is the inverse of
f(M) − μI. This proves that μ does not lie in σ(f(M)).

On the other hand, suppose that μ is of form (26). Define the function k(ζ) by

    k(\zeta) = \frac{f(\zeta) - f(\lambda)}{\zeta - \lambda}.

Clearly, k is analytic in an open set containing σ(M), so k(M) can be defined by
(21). Since (ζ − λ) k(ζ) = f(ζ) − f(λ), it follows from part (ii) that

    (M - \lambda I)\, k(M) = f(M) - f(\lambda) I.                   (27)

Since λ belongs to σ(M), the first factor is not invertible. We appeal now to part (ii)
of theorem 1: the two factors in (27) commute, so if their product f(M) − μI were
invertible, M − λI would be invertible too. Hence μ = f(λ) belongs to σ(f(M)).

(iv) By assumption, g(ω) is analytic on f(σ(M)). Since by part (iii) the spectrum
of f(M) is f(σ(M)), it follows that formula (21) can be applied to g in place of f,


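and f(M) in place of M and D in place of C:

    g(f(M)) = \frac{1}{2\pi i} \oint_D (\omega - f(M))^{-1} g(\omega)\, d\omega.   (28)

For ω on D, (ω − f(ζ))^{-1} is an analytic function on σ(M); therefore, applying
formula (21) once more, we get

    (\omega - f(M))^{-1} = \frac{1}{2\pi i} \oint_C (\zeta - M)^{-1} (\omega - f(\zeta))^{-1}\, d\zeta,   (29)

provided that the contour C does not wind around any of the points ω on D. Now set
(29) into (28):

    g(f(M)) = \frac{1}{(2\pi i)^2} \oint_D \oint_C (\zeta - M)^{-1} (\omega - f(\zeta))^{-1} g(\omega)\, d\zeta\, d\omega.   (30)

We reverse the order of integration; since C does not wind around points of D, it
follows that D winds around every point ζ of C. By the Cauchy integral formula,

    \frac{1}{2\pi i} \oint_D (\omega - f(\zeta))^{-1} g(\omega)\, d\omega = g(f(\zeta)) = h(\zeta),

where we have used (23). Setting this in (30) on the right we get, by (21), h(M), as
asserted in (23’). □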


Definition (21) and properties listed in theorem 5 are called the functional calculus 
for operators. Relation (22) is called the spectral mapping theorem. 
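
Definition (21) can be evaluated numerically when the algebra consists of 2 × 2 matrices. The sketch below is an illustration under assumptions not found in the text: the contour is a circle discretized by the trapezoid rule, the matrix M and the function f = exp are arbitrary choices, and the helper names are hypothetical. On the circle ζ = center + w with w = radius·e^{iθ}, one has dζ = i w dθ, so the normalized integral becomes the average of (ζ − M)^{-1} f(ζ) w over equally spaced points.

```python
import cmath
import math

def resolvent(m, z):
    """(zI - M)^{-1} for a 2x2 matrix M, by the explicit cofactor formula."""
    (a, b), (c, d) = m
    det = (z - a) * (z - d) - b * c
    return [[(z - d) / det, b / det], [c / det, (z - a) / det]]

def f_of_matrix(f, m, center, radius, n=400):
    """Trapezoid-rule approximation of (1/(2 pi i)) times the contour integral of
    (zeta - M)^{-1} f(zeta) over the circle |zeta - center| = radius."""
    total = [[0j, 0j], [0j, 0j]]
    for k in range(n):
        w = radius * cmath.exp(2j * math.pi * k / n)   # w = zeta - center
        zeta = center + w
        r = resolvent(m, zeta)
        fz = f(zeta)
        for i in range(2):
            for j in range(2):
                total[i][j] += r[i][j] * fz * w        # d zeta = i w d theta
    return [[total[i][j] / n for j in range(2)] for i in range(2)]

def exp_series(m, terms=60):
    """Partial sum of the exponential series sum of M^n / n!."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[sum(term[i][k] * m[k][j] for k in range(2)) / n
                 for j in range(2)] for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

M = [[1.0, 2.0], [0.0, 3.0]]      # illustrative; spectrum {1, 3}
by_contour = f_of_matrix(cmath.exp, M, center=0.0, radius=5.0)
by_series = exp_series(M)
error = max(abs(by_contour[i][j] - by_series[i][j])
            for i in range(2) for j in range(2))
print("max entrywise difference:", error)
```

The agreement of the two computations illustrates the remark in the proof of part (i) that (21) reproduces the power series definition (20). The diagonal entries of the result are e and e^3, the images of the eigenvalues 1 and 3 under exp, as the spectral mapping theorem predicts.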

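Suppose that the spectrum of M can be decomposed as the union of N pairwise
disjoint closed components:

    \sigma(M) = \sigma_1 \cup \cdots \cup \sigma_N, \qquad \sigma_j \cap \sigma_k = \emptyset \text{ for } j \ne k.   (31)

For each j, denote by C_j a contour in the resolvent set of M that winds once around
each point of σ_j but not around the points of σ_k, k ≠ j. We define

    P_j = \frac{1}{2\pi i} \oint_{C_j} (\zeta - M)^{-1}\, d\zeta.    (32)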


Theorem 6.
(i) The P_j are disjoint projections, that is,

    P_j^2 = P_j \quad \text{and} \quad P_j P_k = 0 \quad \text{for } j \ne k.   (33)

(ii)

    \sum_j P_j = I.                                                 (34)

(iii) P_j ≠ 0 if σ_j is not empty.




Proof. Relations (33) are corollaries of part (ii) of theorem 5. Since C = Σ C_j
winds once around every point of σ(M), (34) follows by summing (32) over all j,
and using (14). We leave the proof of part (iii) to the diligent reader. □
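
For a 2 × 2 matrix with two distinct eigenvalues, the projections (32) can be approximated by discretizing a small circle around each eigenvalue. The following sketch makes several assumptions that are not part of the text: the matrix M, the radii of the circles, the trapezoid rule, and all helper names.

```python
import cmath
import math

def resolvent(m, z):
    """(zI - M)^{-1} for a 2x2 matrix M, by the explicit cofactor formula."""
    (a, b), (c, d) = m
    det = (z - a) * (z - d) - b * c
    return [[(z - d) / det, b / det], [c / det, (z - a) / det]]

def riesz_projection(m, center, radius, n=200):
    """Approximate (1/(2 pi i)) times the integral of the resolvent over the
    circle |zeta - center| = radius, using n equally spaced points."""
    total = [[0j, 0j], [0j, 0j]]
    for k in range(n):
        w = radius * cmath.exp(2j * math.pi * k / n)   # w = zeta - center
        r = resolvent(m, center + w)
        for i in range(2):
            for j in range(2):
                total[i][j] += r[i][j] * w             # d zeta = i w d theta
    return [[total[i][j] / n for j in range(2)] for i in range(2)]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

M = [[1.0, 2.0], [0.0, 3.0]]      # illustrative; spectrum {1} union {3}
P1 = riesz_projection(M, center=1.0, radius=0.5)
P2 = riesz_projection(M, center=3.0, radius=0.5)
ZERO = [[0.0, 0.0], [0.0, 0.0]]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
SUM = [[P1[i][j] + P2[i][j] for j in range(2)] for i in range(2)]
print("P1^2 - P1:", max_diff(mat_mul(P1, P1), P1))
print("P1 P2:    ", max_diff(mat_mul(P1, P2), ZERO))
print("P1 + P2 - I:", max_diff(SUM, IDENTITY))
```

The three printed quantities correspond to relations (33) and (34).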


Exercise 1. Show that if P is a nonzero projection, that is, satisfies P^2 = P ≠ 0,
then

    |P| \ge 1.                                                      (35)

Exercise 2. Show that the spectral radius |σ(M)| depends upper semicontinuously
on M in the norm topology, namely, that if lim M_n = M, then

    \limsup |\sigma(M_n)| \le |\sigma(M)|.

Exercise 3. Show that |exp M| ≤ exp |M|.

Exercise 4. Show that if 0 does not belong to σ(M), and if 0 can be connected
to ∞ by a curve that lies in ρ(M), then log(M) can be defined so that

    \exp \log(M) = M.

Exercise 5. Define ℒ_M to be the closure of the algebra generated by M and
(ζ − M)^{-1}, ζ in ρ(M). Show that ℒ_M is a commutative subalgebra of ℒ.


NOTE. For the history of the spectral theory of operators in a Banach space, see 
pp. 607-609 of Dunford and Schwartz. 
The term “spectrum” is due to Hilbert, a remarkable anticipation of its meaning in
quantum mechanics.


BIBLIOGRAPHY


Dunford, N. Spectral theory I, Convergence to projections. Trans. AMS, 54 (1943): 185-217.

Gelfand, I. M. Normierte Ringe. Mat. Sbornik, N.S., 9 (1941): 3-24.

Lorch, E. R. The spectrum of linear transformations. Trans. AMS, 54 (1942): 238-248.

Nagumo, M. Einige analytische Untersuchungen in linearen metrischen Ringen. Jap. J. Math., 13 (1936):
61-80.

Riesz, F. Sur certains systèmes singuliers d’équations intégrales. An. École Norm. Sup. (3), 28 (1911):
33-62.

Taylor, A. E. The resolvent of a closed transformation. Bull. AMS, 44 (1938): 70-74.

Wiener, N. Note on a paper of M. Banach. Fund. Math., 4 (1923): 136-143.


18

GELFAND’S THEORY
OF COMMUTATIVE
BANACH ALGEBRAS


A Banach algebra, defined in chapter 17, is an associative, complete normed algebra
over C. The algebras ℒ we deal with in this chapter are assumed to have a unit,
denoted as I, with |I| = 1, and to be commutative:

    MN = NM \quad \text{for all } N, M \text{ in } \mathcal{L}.       (1)

The main topic, as in chapter 17, is invertibility, approached here through two
concepts that turn out to be equivalent: multiplicative functionals and maximal ideals.

Definition. A multiplicative functional p in a Banach algebra ℒ is a homomorphism
of ℒ into C.


Although defined purely algebraically, homomorphisms of a commutative Banach 
algebra with a unit have the following analytic property: 


Theorem 1. Every homomorphism p of a commutative Banach algebra with unit
into C is a contraction, that is, satisfies

    |p(M)| \le |M|.                                                 (2)

Proof. Since M = IM for every M, and since p is a homomorphism,

    p(M) = p(IM) = p(I)\, p(M).

It follows from this that, unless p ≡ 0, a trivial case,

    p(I) = 1.                                                       (3)




Let K be an invertible element of ℒ, that is, KN = I. Then, by (3),

    p(K)\, p(N) = p(KN) = p(I) = 1;

this proves

Lemma 2. If K is invertible, p(K) ≠ 0.

Suppose now that contrary to (2), |p(M)| > |M| for some M; then

    B = \frac{M}{p(M)}                                              (4)

satisfies

    |B| < 1.                                                        (4')

It follows then from (9), (9’) of theorem 2 in chapter 17 that

    K = I - B

is invertible. On the other hand, by (4) and (3),

    p(K) = p(I) - p\left( \frac{M}{p(M)} \right) = 1 - 1 = 0;

this contradicts the observation in lemma 2: if K is invertible, p(K) ≠ 0. Hence (2)
holds for all M. □


The main result of this chapter is the converse of lemma 2. 


Theorem 3. An element K of a commutative Banach algebra ℒ with a unit is
invertible if and only if

    p(K) \ne 0                                                      (5)

for all homomorphisms p of ℒ into C.

Proof. As observed already in lemma 2, if K is invertible, then p(K) ≠ 0 for all
homomorphisms p. What remains to be shown is the converse: if K is not invertible,
there is a homomorphism p : ℒ → C such that

    p(K) = 0.                                                       (6)

To construct such a p, we need some algebraic and analytic notions.


Definition. Let ℒ be a commutative algebra with unit. A subset ℐ of ℒ is called an
ideal if it has these three properties:

(i) ℐ is a linear subspace of ℒ.
(ii) For any M in ℒ, Mℐ ⊂ ℐ.
(iii) ℐ is nontrivial, meaning that it is neither {0} nor all of ℒ.

Note that an ideal cannot contain an invertible element N. For then it would follow
from (ii) that ℐ contains every element of ℒ, contrary to (iii). In particular, ℐ does
not contain I.


Lemma 4. Let ℒ and A be commutative algebras with unit over the same field, q a
homomorphic map of ℒ onto A. Suppose that q is nontrivial in this sense:

(i) q is not an isomorphism.
(ii) q(ℒ) does not consist of {0} only.

The kernel of the homomorphism q, consisting of all K in ℒ mapped into 0 by q,
is an ideal in ℒ. Conversely, every ideal ℐ in ℒ is the kernel of some nontrivial
homomorphism.


Proof. It is easy to see that the kernel of q is an ideal. To show the converse, define
A to be

    A = \mathcal{L} \ (\mathrm{mod}\ \mathcal{I}) = \mathcal{L}/\mathcal{I},

meaning that A consists of equivalence classes of elements of ℒ, two elements M
and M' being equivalent mod ℐ if their difference belongs to ℐ:

    M \equiv M' \ \mathrm{mod}\ \mathcal{I} \quad \text{if } M - M' \in \mathcal{I}.

Addition and multiplication of equivalence classes is performed by picking arbitrary
representatives of each class, adding or multiplying them, and then forming the class
to which they belong.
The mapping q is taken as the natural assignment to each element M of ℒ the class
of all M' congruent to M mod ℐ. Clearly, the kernel of q is ℐ. □


Lemma 4’. ℒ, A and q as in lemma 4, J an ideal in A. The inverse image of J is
an ideal in ℒ.

Proof is obvious.

We have noted earlier that an ideal contains no invertible elements. Conversely:

Lemma 5. Every nonzero element K of ℒ that is not invertible belongs to some
ideal.

Proof. That ideal is the principal ideal Kℒ generated by K. It is easy to verify
properties (i) and (ii); since Kℒ does not contain the identity I, property (iii) holds
too. □


Definition. A maximal ideal is an ideal that is not contained in any other ideal.

Lemma 6. Every ideal is contained in some maximal ideal.

Proof. The ideals in ℒ are partially ordered by inclusion. Let {ℐ_ν} be a totally
ordered collection of ideals. We claim that their union is an ideal: it is easy to see that
properties (i) and (ii) hold. Since the identity I is not contained in any of the ℐ_ν, it
is not contained in their union, either; this proves property (iii). We appeal now to
Zorn’s lemma to conclude that among all ideals containing a given one there is one
that is maximal. □

Combining lemma 5 and lemma 6 we deduce lemma 7.

Lemma 7. A noninvertible element K of ℒ belongs to some maximal ideal M.

Lemma 8. Let M denote a maximal ideal of ℒ. Then A = ℒ/M is a division
algebra; that is, every nonzero element of A is invertible.

Proof. If A contained a nonzero noninvertible element C, J = CA would be an
ideal contained in A. Now consider the natural map q : ℒ → ℒ/M = A.
The inverse image ℐ of J would be, according to lemma 4’, an ideal in ℒ, and it
would properly contain M, the inverse image of 0 in A. Since M was assumed
maximal, this is not possible. □

We turn now to some analytical results:

Lemma 9. The closure of an ideal ℐ in a commutative Banach algebra ℒ with
unit is an ideal.

Proof. It is easy to verify that the closure of ℐ has properties (i) and (ii), and that it
does not consist of 0 alone. We claim that the closure of ℐ does not contain I. Since ℐ
contains no invertible elements, and since by theorem 2 of chapter 17, all N contained
in the open unit ball centered at I are invertible, I does not belong to the closure of ℐ. □

Lemma 10. Every maximal ideal M in a commutative Banach algebra ℒ with unit
is closed.

Proof. If M were not closed, its closure would, by lemma 9, be an ideal, properly
containing M. This contradicts maximality. □


Lemma 11. ℒ as above, ℐ a closed ideal in ℒ. Then A = ℒ/ℐ is a Banach algebra
in the natural norm of the quotient algebra.


Exercise 1. Prove lemma 11. 




The following result, due to Mazur, is the keystone to the sequence of lemmas
above.

Theorem 12. Let A be a Banach algebra with a unit that is a division algebra; then
A is isomorphic to the field of complex numbers.

Proof. As defined in chapter 17, the spectrum of an element K in A is the set
of complex numbers ζ such that ζI − K is not invertible, I being the unit of A.
According to theorem 4 of chapter 17, the spectrum of any K is not empty. This
means that there is a complex number, call it κ, such that κI − K is not invertible.
Since A is assumed to be a division algebra, this can only be if κI − K is the zero
element in A, in which case κI = K. Thus every element K is a multiple of the unit;
the mapping

    K \to \kappa                                                    (7)

is the isomorphism of A with C. □
We are now ready to prove theorem 3; that is, given any noninvertible element K
of ℒ, we construct a homomorphism p : ℒ → C such that p(K) = 0.
According to lemma 7, K belongs to some maximal ideal M; according to
lemma 10, M is closed. According to lemma 11, ℒ/M is a Banach algebra, and
according to lemma 8, ℒ/M is a division algebra. Then, by theorem 12, ℒ/M is
isomorphic to C. The composition

    p_M : \mathcal{L} \to \mathcal{L}/M \to C                        (8)

is a homomorphism of ℒ onto C, whose nullspace is M. Since K belongs to M,

    p_M(K) = 0,

as asserted in theorem 3. □
We restate relation (8) as follows: 
Theorem 13. Let ℒ denote a commutative Banach algebra with unit. To each maximal
ideal M in ℒ there corresponds a homomorphism p_M of ℒ → C whose
nullspace is M:

    p_M(K) = 0 \quad \text{iff } K \text{ in } M.

Conversely, the null space of every homomorphism of ℒ onto C is a maximal ideal.


We remark that the converse statement is a purely algebraic fact. We draw now 
some consequences of theorem 3: 




Theorem 14. ℒ as above, N any element of ℒ. The spectrum of N is

    \sigma(N) = \{ p(N) \}                                          (9)

as p ranges over all homomorphisms of ℒ into C.

Proof. By definition of spectrum, ζ belongs to σ(N) iff ζI − N is not invertible.
According to theorem 3 this is the case iff p(ζI − N) = 0 for some p. Since, by (3),
p(I) = 1, it follows that ζ lies in σ(N) iff ζ = p(N) for some p. □
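
A very small example may help fix ideas. The algebra chosen below, C^3 with componentwise operations and the maximum norm, is not discussed in the text; it is assumed here only for illustration. For this algebra the nonzero homomorphisms into C are the three coordinate evaluations, so theorem 14 says that the spectrum of an element is the set of its coordinates, and theorem 3 says that an element is invertible exactly when no coordinate vanishes. All function names are hypothetical.

```python
def multiply(a, b):
    """Componentwise product in the algebra C^n."""
    return tuple(x * y for x, y in zip(a, b))

def norm(a):
    """Maximum norm; the unit (1, ..., 1) has norm 1."""
    return max(abs(x) for x in a)

def characters(n):
    """The coordinate evaluations p_j(N) = N[j], j = 0, ..., n - 1."""
    return [lambda elem, j=j: elem[j] for j in range(n)]

def spectrum(elem):
    """The set of values p(N) as p ranges over the coordinate evaluations."""
    return {p(elem) for p in characters(len(elem))}

def is_invertible(elem):
    """Invertibility criterion of theorem 3: no homomorphism vanishes on elem."""
    return all(p(elem) != 0 for p in characters(len(elem)))

def shift(z, elem):
    """The element zI - N."""
    return tuple(z - x for x in elem)

N = (2, -1, 3j)
K = (4, 0, 1)
print("spectrum of N:", spectrum(N))
print("N invertible:", is_invertible(N), " K invertible:", is_invertible(K))
for z in spectrum(N):
    print(z, "I - N invertible:", is_invertible(shift(z, N)))
```

Each evaluation is multiplicative and bounded in absolute value by the maximum norm, in agreement with the contraction property (2).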

We show next how to use the characterization of the spectrum contained in
theorem 14 to give a new proof of the spectral mapping theorem. We recall the functional
calculus developed in chapter 17; formula (21) there defines f(M) as

    f(M) = \frac{1}{2\pi i} \oint_C (\zeta - M)^{-1} f(\zeta)\, d\zeta   (10)

for every f analytic in an open set containing σ(M). According to part (iii) of
theorem 5 in chapter 17, equation (22),

    \sigma(f(M)) = f(\sigma(M)).                                    (11)

The integral (10) is the limit in the sense of the norm of the usual partial sums
employed in defining a Riemann integral. It follows that for any bounded linear map
ℓ : ℒ → C we may apply ℓ inside the integral on the right in (10):

    \ell(f(M)) = \frac{1}{2\pi i} \oint_C \ell\big( (\zeta - M)^{-1} \big) f(\zeta)\, d\zeta.   (10')

In particular, (10’) holds for every homomorphism p of ℒ → C:

    p(f(M)) = \frac{1}{2\pi i} \oint_C p\big( (\zeta - M)^{-1} \big) f(\zeta)\, d\zeta.   (12)

Since p is a homomorphism,

    p\big( (\zeta - M)^{-1} \big) = (\zeta - p(M))^{-1}.

Setting this into (12) gives

    p(f(M)) = \frac{1}{2\pi i} \oint_C (\zeta - p(M))^{-1} f(\zeta)\, d\zeta.   (13)

It follows from theorem 14 that p(M) belongs to σ(M). By construction, the contour
C winds once around every point of σ(M); so by the Cauchy integral formula the
right side of (13) equals f(p(M)). This implies that

    p(f(M)) = f(p(M)).                                              (14)




According to theorem 14, as p runs through all homomorphisms the left side of (14)
fills up the spectrum of f(M) while the right side fills up f(σ(M)). So we conclude
that (11) holds. □


The homomorphisms p constructed in this chapter can be regarded as functions
of the associated maximal ideal M and of the element N of the algebra ℒ:

    p = p(M, N).                                                    (15)

For fixed N, p is a function on the space J of maximal ideals.

Definition. The functions p defined above constitute the Gelfand representation of
the commutative Banach algebra ℒ on the space J of its maximal ideals.


Theorem 15. 


(i) The Gelfand representation is a homomorphism of $\mathcal{L}$ into an algebra of
complex-valued functions on the set $J$.
(ii) The representation is a contraction: $|p(M, N)| \le |N|$.
(iii) The spectrum of $N$ is the range of the function representing $N$.
(iv) The unit $I$ is represented by $p(M, I) \equiv 1$.
(v) The functions $p$ separate points of $J$; that is, given two distinct maximal ideals
$M$ and $M'$, there is an $N$ such that

$$p(M, N) \ne p(M', N).$$


Proof. Part (i) expresses the fact that each p is a homomorphism; part (ii) restates 
(2), part (iii) restates theorem 14, part (iv) restates (3), and part (v) follows from 
theorem 13. 
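
A finite-dimensional illustration may help fix ideas. The sketch below assumes, purely for illustration, that the algebra consists of complex sequences of length n, with cyclic convolution as product and the sum of absolute values as norm; this example is not taken from the text. For this algebra the homomorphisms into C are the n character sums p_k(a) = sum_j a_j w^(jk), w = exp(2 pi i / n), so the Gelfand representation is the discrete Fourier transform. All function names are hypothetical; the three checks correspond to parts (i), (ii), and (iv) of theorem 15.

```python
import cmath

def cyclic_convolve(a, b):
    """Product in the algebra: cyclic convolution of two length-n sequences."""
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]

def gelfand_transform(a):
    """Values p_k(a) = sum_j a[j] * w**(j*k), w = exp(2*pi*i/n), k = 0..n-1."""
    n = len(a)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(a[j] * w ** (j * k) for j in range(n)) for k in range(n)]

def l1_norm(a):
    """Norm of the algebra: sum of absolute values of the entries."""
    return sum(abs(c) for c in a)

a = [2.0, 1.0, 0.0, 0.5]
b = [1.0, -1.0, 3.0, 0.0]
unit = [1.0, 0.0, 0.0, 0.0]          # the unit element of the algebra

pa = gelfand_transform(a)
pb = gelfand_transform(b)
pab = gelfand_transform(cyclic_convolve(a, b))

# Part (i): each p_k is multiplicative.
multiplicative = all(abs(pab[k] - pa[k] * pb[k]) < 1e-9 for k in range(4))
# Part (ii): each p_k is a contraction.
contraction = all(abs(v) <= l1_norm(a) + 1e-12 for v in pa)
# Part (iv): the unit is represented by the constant function 1.
unit_is_one = all(abs(v - 1) < 1e-12 for v in gelfand_transform(unit))

print(multiplicative, contraction, unit_is_one)
```

In this toy setting part (iii) says that a sequence is invertible under cyclic convolution exactly when none of the values in `gelfand_transform(a)` is zero.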


Definition. The natural topology on the space $J$ of maximal ideals is the coarsest
one in which all functions $p(M, N)$, $N$ fixed, are continuous. This is called the
Gelfand topology.


Theorem 16. $J$ is compact in the Gelfand topology.

Proof. Consider the product space

$$P = \prod_{N \in \mathcal{L}} D_{|N|},$$    (16)

where $D_r$ denotes the disk $|\zeta| \le r$ in $\mathbb{C}$. Each disk is compact; therefore by
Tychonov's theorem so is their product, $P$, in the product topology. By (15) and part (ii)
of theorem 15, $p(M, N)$ lies in the disk $D_{|N|}$. We map $J$ into $P$ by assigning to each
$M$ of $J$ the point

$$\prod_{N \in \mathcal{L}} p(M, N).$$    (17)

By (v) of theorem 15, (17) is an embedding of $J$ in $P$. Clearly, the Gelfand topology
is the same as that induced by the embedding. Since $P$ is compact, the compactness
of $J$ would follow from knowing that the image of $J$ under (17) is closed. Let
$t = \prod t_N$ be a point of the closure of (17). We claim that $p(N) = t_N$ is a homomorphism
of $\mathcal{L} \to \mathbb{C}$, namely that $t_{N+M} = t_N + t_M$, $t_{NM} = t_N t_M$, and $t_{cN} = c\, t_N$. These
relations are, according to part (i) of theorem 15, satisfied by the points (17). Since
these relations involve only two factors at a time, they remain true on the closure
of (17). □


BIBLIOGRAPHY 


Gelfand, I. M. Normierte Ringe. Mat. Sbornik, N.S., 9 (1941): 3-24. 


19


APPLICATIONS OF GELFAND’S THEORY


19.1 THE ALGEBRA C(S) 


$S$ is a compact Hausdorff space, $\mathcal{L} = C(S)$, the algebra of continuous, complex-valued
functions on $S$, normed by the maximum norm:

$$|f| = \max_{s \in S} |f(s)|.$$    (1)

Given any point $r$ on $S$, we can associate with $r$ the homomorphism $p_r : \mathcal{L} \to \mathbb{C}$:

$$p_r(f) = f(r).$$    (2)

As noted in theorem 13 of chapter 18, the kernel of $p_r$ is a maximal ideal

$$M_r = \{ f : f(r) = 0 \}.$$    (3)


Theorem 1. Every maximal ideal M in C(S) is of form (3). 
Exercise 1. Prove theorem 1. 


Theorem 1 shows that the maximal ideal space of C(S) can be identified with S 
itself; the abstract theory gives nothing new! 


19.2 GELFAND COMPACTIFICATION 


$S$ is a locally compact Hausdorff space, $C_b(S)$ the algebra of all complex-valued,
bounded, continuous functions on $S$, normed by

$$|f| = \sup_{s \in S} |f(s)|.$$    (4)

Given any point $r$ in $S$, $M_r$, defined by (3), is a maximal ideal. We claim that if $S$
is not compact there are others. We demonstrate this when $S = \mathbb{R}$:

Let $\{s_n\}$ be a sequence of points tending to $\infty$. Define $Z$ to consist of all $f$ such
that

$$\lim_{n \to \infty} f(s_n) = 0.$$    (5)


Clearly, Z is an ideal, and equally clearly, the functions in Z have no zeros in com- 
mon. As observed in chapter 18, Z is contained in some (in fact many) maximal 
ideals M; yet clearly, M is not of form (3). 

Although theorem 1 fails in the noncompact case, the following is true:


Theorem 2. The set of maximal ideals $M_r$ of form (3) is a dense subset, in the
Gelfand topology, of the space of all maximal ideals of $C_b(S)$ when $S$ is a locally
compact Hausdorff space.


Proof. Let $M_\infty$ denote a maximal ideal. Open sets in the Gelfand topology that
contain $M_\infty$ contain all $M$ that satisfy, for some $\epsilon > 0$,

$$|p(M, h_j) - p(M_\infty, h_j)| < \epsilon, \qquad 1 \le j \le k,$$    (6)

where the $h_j$ are elements of the algebra, and $p(M)$ is defined by equation (15) of
chapter 18. Set $h_j = f_j + c_j$, $c_j = p(M_\infty, h_j)$; we can write (6) in the form

$$|p(M, f_j)| < \epsilon, \qquad p(M_\infty, f_j) = 0, \qquad 1 \le j \le k.$$    (7)

We claim that every open set of form (7) contains some $M_r$. For suppose not; then
for every $r$ in $S$, $M = M_r$ does not satisfy (7). Since by definition (14), chapter 18,
and definition (3) of $M_r$, $p(M_r, f_j) = f_j(r)$, violating (7) means that for all $r$,

$$\max_j |f_j(r)| \ge \epsilon.$$    (8)

Since by (7), $p(M_\infty, f_j) = 0$, $f_j$ belongs to $M_\infty$, $j = 1, \ldots, k$. Since $M_\infty$ is
an ideal,

$$f = \sum_j f_j \bar{f}_j$$    (9)

belongs to $M_\infty$. From (8) we conclude that

$$f(r) \ge \epsilon^2$$    (10)

for every $r$ in $S$. This shows that $f$ is an invertible element, and thus cannot belong
to an ideal. This contradiction shows that every neighborhood of $M_\infty$ contains some
point $M_r$. □

The maximal ideal space of $C_b(S)$, in the Gelfand topology, is called the Gelfand
compactification of $S$. This space contains a dense subspace homeomorphic to $S$;
the restriction of the space of continuous functions on the Gelfand compactification
to this subspace is the space of all bounded continuous functions on $S$.

The next example is more fun.


19.3 ABSOLUTELY CONVERGENT FOURIER SERIES


$\mathcal{L}$ is the algebra of all complex-valued functions $f(\theta)$ on the unit circle $S^1$ that have
an absolutely convergent Fourier series, namely of the form

$$f(\theta) = \sum_n c_n e^{in\theta},$$    (11)

$$|f| = \sum_n |c_n| < \infty.$$    (12)

It is easy to verify that the norm (12) is submultiplicative: $|fg| \le |f|\,|g|$. So $\mathcal{L}$ is a
Banach algebra. The function $f \equiv 1$ is its unit; its norm equals 1. For each point $\omega$
of $S^1$, the mapping

$$p_\omega(f) = f(\omega)$$    (13)

is a homomorphism of $\mathcal{L} \to \mathbb{C}$. Conversely:


Theorem 3. Every homomorphism into $\mathbb{C}$ of the algebra $\mathcal{L}$ of functions on $S^1$ with
norm (12) is of form (13).


Proof. According to theorem 1 of chapter 18, every homomorphism $p$ of a Banach
algebra $\to \mathbb{C}$ has norm 1. Since $e^{i\theta}$ and $e^{-i\theta}$ both have norm 1, it follows that

$$|p(e^{i\theta})| \le 1, \qquad |p(e^{-i\theta})| \le 1.$$    (14)

Since $p$ is a homomorphism,

$$p(e^{i\theta})\, p(e^{-i\theta}) = p(1) = 1.$$    (15)

Combining (14) and (15), we conclude that $|p(e^{i\theta})| = 1$; therefore we can write

$$p(e^{i\theta}) = e^{i\omega}, \qquad \omega \text{ real}.$$    (16)

Since $p$ is a homomorphism, $p(e^{in\theta}) = e^{in\omega}$ for all integers $n$, and for all finite sums

$$p\Big(\sum c_n e^{in\theta}\Big) = \sum c_n e^{in\omega}.$$    (17)

Since $p$ is continuous and $\sum |c_n| < \infty$, (17) holds as well for infinite sums, which
converge in the sense of the norm (12). This proves that $p$ is of form (13). □

According to the main result of chapter 18, theorem 3, an element $f$ of a Banach
algebra $\mathcal{L}$ is invertible if $p(f) \ne 0$ for all homomorphisms $p$ of $\mathcal{L} \to \mathbb{C}$. In view of
theorem 3 above we conclude

Theorem 4. If a function $f$ defined on the unit circle has absolutely convergent
Fourier series, and doesn't vanish at any point of $S^1$, then its reciprocal $f^{-1}$ also
has absolutely convergent Fourier series.


This celebrated theorem, due to Norbert Wiener, is astonishing, for there is no
obvious relation between the Fourier series of a function $f$ and of its reciprocal.
The situation is similar in more variables:

$$f(\theta_1, \ldots, \theta_k) = \sum c_{n_1 \ldots n_k}\, e^{i(n_1\theta_1 + \cdots + n_k\theta_k)}, \qquad \sum |c_{n_1 \ldots n_k}| < \infty.$$


The analogues of theorems 3 and 4 hold, with similar proofs. 
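
One case where the reciprocal's Fourier series can be written down explicitly is sketched below; it is an illustration chosen here, not an example from the text. Assume f(θ) = 1 - a e^{iθ} with |a| < 1, so that f has no zero on the circle. The geometric series gives the coefficients of 1/f as c_n = a^n for n ≥ 0, and the norm (12) of the reciprocal is 1/(1 - |a|). The value of `a` and the truncation order `N` are arbitrary choices.

```python
import cmath

a = 0.5 + 0.3j      # illustrative choice; any complex number with |a| < 1 works
N = 60              # truncation order of the series for 1/f

def f(theta):
    """f(theta) = 1 - a*exp(i*theta); it has no zero on the unit circle."""
    return 1 - a * cmath.exp(1j * theta)

# Fourier coefficients of 1/f from the geometric series: c_n = a**n, n >= 0.
coeffs = [a ** n for n in range(N)]

def reciprocal_series(theta):
    """Truncated Fourier series of 1/f."""
    return sum(c * cmath.exp(1j * n * theta) for n, c in enumerate(coeffs))

# Norm (12) of the truncated reciprocal, and its limiting value 1/(1 - |a|).
norm_partial = sum(abs(c) for c in coeffs)
norm_limit = 1 / (1 - abs(a))

# f times the truncated series should be close to 1 at every sample point.
sample_points = [2 * cmath.pi * k / 16 for k in range(16)]
max_error = max(abs(f(t) * reciprocal_series(t) - 1) for t in sample_points)

print(norm_partial, norm_limit, max_error)
```

For a general nonvanishing f no such closed form is available, which is what makes theorem 4 remarkable.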


19.4 ANALYTIC FUNCTIONS IN THE CLOSED UNIT DISK 
$A$ is the algebra of all functions $f(z)$ analytic in the open unit disk $|z| < 1$, and
continuous up to the boundary $|z| = 1$. Clearly, $A$ is a Banach algebra under the
norm

$$|f| = \max_{|z| \le 1} |f(z)|.$$    (18)

Given any point $w$ in $|z| \le 1$, the mapping

$$p_w(f) = f(w)$$    (19)

is a homomorphism of $A \to \mathbb{C}$. Conversely:

Theorem 5. Every homomorphism of the algebra $A \to \mathbb{C}$ is of form (19).

Proof. By theorem 1, chapter 18, such a homomorphism $p$ has norm $\le 1$. Since
according to (18), $f(z) = z$ has norm $= 1$, it follows that

$$|p(z)| \le 1.$$    (20)


Denote the value of $p(z)$ by $w$; since $p$ is a homomorphism, $p(z^n) = w^n$ and for all
finite sums

$$p\Big(\sum_0^n a_j z^j\Big) = \sum_0^n a_j w^j.$$    (21)

We can express (21) so: when $f(z)$ is a polynomial, $p(f) = f(w)$. Since every
$f$ in $A$ can be approximated uniformly on $|z| \le 1$ by polynomials, and since $p$ is
continuous, this relation holds for all $f$ in $A$, as asserted in theorem 5. □


Exercise 2. Show that every function in $A$ can be approximated uniformly on the
unit disk by polynomials.


Theorem 6. Let $f_1, \ldots, f_m$ be a collection of $m$ functions in $A$ that have no common
zero in the disk. Then there are functions $g_1, \ldots, g_m$ in $A$ such that

$$\sum_j f_j g_j \equiv 1.$$    (22)

Proof. Consider the set $Z$ of all functions of form

$$f = \sum_j f_j h_j, \qquad h_j \text{ in } A.$$    (23)

Unless $Z$ is all of $A$, it is an ideal in $A$, and therefore it is contained in some maximal
ideal $M$. According to theorem 13 of chapter 18, $M$ is the null set of a homomorphism.
By theorem 5, all homomorphisms are of form (19); therefore a maximal
ideal is a collection of all functions in $A$ that vanish at some point $w$. Since the functions
$f_j$, $j = 1, \ldots, m$, were assumed to have no common zero, they cannot belong
to the same ideal. This means that (23) cannot be an ideal; therefore the functions $f$
of form (23) are all of $A$. In particular, $f \equiv 1$ is of that form; this proves (22). □


The situation is similar for analytic functions of $k$ complex variables, defined in
the polydisk

$$\prod_{j=1}^{k} \{ |z_j| < 1 \}$$

and continuous up to the boundary. The analogues of theorems 5 and 6 hold, with
analogous proofs.

19.5 ANALYTIC FUNCTIONS IN THE OPEN UNIT DISK 


$\mathcal{B}$ is the algebra of bounded analytic functions in the open unit disk $|z| < 1$. These
form a Banach algebra under the norm

$$|f| = \sup_{|z| < 1} |f(z)|.$$    (24)

Every mapping of form (19), $|w| < 1$, is a homomorphism $\mathcal{B} \to \mathbb{C}$; therefore the
set of $f$ satisfying

$$f(w) = 0, \qquad |w| < 1$$    (25)

is a maximal ideal $M_w$. Not all maximal ideals in $\mathcal{B}$ are of this form. However, the
following is true:


Theorem 7. The set of maximal ideals $M_w$ of form (25) is a dense subset, in the
Gelfand topology, of the space of all maximal ideals.


The analysis presented in the proof of theorem 2 shows that the theorem is equivalent
to the following proposition:


Theorem 7'. Let $f_1, \ldots, f_m$ be a collection of functions in $\mathcal{B}$ that have the property
that

$$\sum_j |f_j(z)| \ge 1$$    (26)

for every $z$ in $|z| < 1$. Then there exist $m$ functions $g_j \in \mathcal{B}$ such that

$$\sum_j g_j f_j \equiv 1.$$    (27)


Theorem 7 is called the corona theorem. The implication (26) $\Rightarrow$ (27) is subtle
and deep. It was shown to be true by Lennart Carleson, using function theoretic
arguments. Tom Wolff succeeded in 1979 in giving an entirely different proof, using
methods of partial differential equations (e.g., see Koosis).


19.6 WIENER’S TAUBERIAN THEOREM 


The next application deals with a Banach algebra without a unit. This is easily remedied
by adjoining a unit $I$ in a purely formal fashion. That is, if $A$ is a Banach algebra
without a unit, the enlarged algebra $\mathcal{L}$ consists of elements of the form $\lambda I + M$, $\lambda$
a complex number, $M$ an element of $A$. Addition is defined componentwise, and
multiplication according to the distributive law:

$$(\kappa I + N)(\lambda I + M) = \kappa\lambda I + \kappa M + \lambda N + NM.$$

Note that $A$ is a maximal ideal in $\mathcal{L}$.
Norm in the enlarged algebra is defined as follows:

$$|\lambda I + M| = |\lambda| + |M|.$$


Clearly, this new norm is subadditive and submultiplicative, and the unit I has 
norm 1. 
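
The formal adjunction of a unit can be tried out on a small example. The sketch below assumes, for illustration only, that A is the algebra of polynomials with zero constant term, normed by the sum of the absolute values of the coefficients; this A has no unit (and is not complete, which plays no role in the algebraic checks). An element λI + M of the enlarged algebra is stored as a pair `(lam, m)`, where `m` is the coefficient list of M with `m[0] == 0`. All names are hypothetical.

```python
def poly_add(m, n):
    """Coefficientwise sum of two coefficient lists."""
    size = max(len(m), len(n))
    m = list(m) + [0] * (size - len(m))
    n = list(n) + [0] * (size - len(n))
    return [x + y for x, y in zip(m, n)]

def poly_scale(c, m):
    """Scalar multiple c*M."""
    return [c * x for x in m]

def poly_mul(m, n):
    """Product NM of two polynomials given by coefficient lists."""
    out = [0] * (len(m) + len(n) - 1)
    for i, mi in enumerate(m):
        for j, nj in enumerate(n):
            out[i + j] += mi * nj
    return out

def unitized_mul(x, y):
    """(kappa*I + N)(lam*I + M) = kappa*lam*I + kappa*M + lam*N + N*M."""
    kappa, n = x
    lam, m = y
    rest = poly_add(poly_add(poly_scale(kappa, m), poly_scale(lam, n)),
                    poly_mul(n, m))
    return (kappa * lam, rest)

def unitized_norm(x):
    """|lam*I + M| = |lam| + |M|, with |M| the sum of |coefficients|."""
    lam, m = x
    return abs(lam) + sum(abs(c) for c in m)

x = (2, [0, 1, -1])        # 2I + z - z^2
y = (-1, [0, 0.5])         # -I + 0.5 z
unit = (1, [0])            # the adjoined unit I

product = unitized_mul(x, y)
print(product, unitized_norm(product))
```

With these inputs the product is `(-2, [0, 0.0, 1.5, -0.5])`, of norm 4, which is below the product of the norms, 4 × 1.5 = 6.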


We take $A$ to be the space $L^1$ of complex-valued integrable functions on $\mathbb{R}$, which
we denote by lowercase letters. We define multiplication to be convolution:

$$(f * g)(s) = \int f(s - u)\, g(u)\, du.$$    (28)

The change of variable of integration $s - u = v$ shows that convolution is commutative.

Lemma 8. The $L^1$-norm is submultiplicative with respect to convolution:

$$|f * g|_1 \le |f|_1\, |g|_1.$$    (29)

Proof. Suppose that both $f$ and $g$ are continuous and of compact support. Then
so is $f * g$, and it is the $L^1$-limit as $\Delta \to 0$ of finite sums of the form

$$\sum_j f(s - j\Delta)\, g(j\Delta)\, \Delta.$$    (29')

Note that the $L^1$-norm of a translate of $f$ is the same as the $L^1$-norm of $f$; therefore,
using the subadditivity of norm, we conclude that the $L^1$-norm of (29') is bounded
by

$$|f|_1 \sum_j |g(j\Delta)|\, \Delta.$$    (30)

We let $\Delta$ tend to zero; the sum in (29') tends to $f * g$, and (30) tends to $|f|_1 |g|_1$;
so we obtain in the limit inequality (29). □


Actually the proof presented above applies to any norm for the function $f$ that is
translation invariant, leading to the following important inequality:

$$|f * g| \le |f|\, |g|_1,$$    (31)

where $|\ |$ stands for any translation invariant norm for functions on $\mathbb{R}$. From now
on all norms in this section denote the $L^1$-norm.
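
The Riemann-sum argument has an exact discrete counterpart, which the sketch below checks numerically. It assumes that f and g are replaced by their samples on a grid of step `delta`, with the convolution and the $L^1$-norm replaced by the corresponding sums; the two sample functions are arbitrary illustrative choices and the function names are hypothetical.

```python
def discrete_convolution(f_samples, g_samples, delta):
    """Riemann-sum analogue of (28): sum_j f((i-j)*delta) * g(j*delta) * delta."""
    out = [0.0] * (len(f_samples) + len(g_samples) - 1)
    for i, fi in enumerate(f_samples):
        for j, gj in enumerate(g_samples):
            out[i + j] += fi * gj * delta
    return out

def l1_norm(samples, delta):
    """Riemann-sum analogue of the L^1 norm."""
    return sum(abs(v) for v in samples) * delta

delta = 0.01
# f: a hat function on [0, 1]; g: a sign-changing step function on [0, 1].
f_samples = [1 - abs(2 * k * delta - 1) for k in range(101)]
g_samples = [1.0 if k < 50 else -1.0 for k in range(101)]

conv = discrete_convolution(f_samples, g_samples, delta)
lhs = l1_norm(conv, delta)                                     # |f * g|_1
rhs = l1_norm(f_samples, delta) * l1_norm(g_samples, delta)    # |f|_1 |g|_1

print(lhs, rhs)
```

Because g changes sign, cancellation occurs in the convolution and `lhs` comes out strictly smaller than `rhs`; equality would require, for instance, that both functions be nonnegative.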

We denote by $\mathcal{L}$ the convolution algebra of $L^1$ functions augmented formally by
a unit, which we denote by $e$. We proceed now to determine all maximal ideals of $\mathcal{L}$,
or rather, what is equivalent, the multiplicative linear functionals $p$ whose nullspaces
they are. One of the maximal ideals in the convolution algebra is $L^1$ itself. For any
other $p$, there is an $f$ in $L^1$ such that $p(f) \ne 0$; we normalize $f$ so that

$$p(f) = 1.$$    (32)

Let $t$ be any real number, and denote by $f_t$ the translate of $f$:

$$f_t(s) = f(s - t).$$    (33)


Lemma 9. $f_t$ depends continuously on $t$ in the sense of the norm, that is,

$$\lim_{h \to 0} |f_{t+h} - f_t| = 0.$$    (34)

Proof. This is clearly true when $f$ is continuous and has compact support; since
every $f$ in $L^1$ can be approximated in norm by such functions, (34) holds for all $f$. □

We define now the function $\chi(t)$ as

$$\chi(t) = p(f_t).$$    (35)

Since $p$ is a continuous linear functional, it follows from lemma 9 that $\chi$ is a continuous
function. Since the norm of $f_t$ is independent of $t$, it follows that $\chi(t)$ is
uniformly bounded by $|f|$ for all real $t$.

Let $t$ and $r$ be two real numbers; we claim that

$$f_{t+r} * f = f_t * f_r.$$    (36)

To see this, we apply the definition (28) of convolution to the left side:

$$(f_{t+r} * f)(s) = \int f(v - t - r)\, f(s - v)\, dv.$$

Introducing $v - r = u$ as new variable of integration transforms this to

$$\int f(u - t)\, f(s - r - u)\, du;$$

this is $f_t * f_r$, as asserted in (36).
Now let $p$ act on (36). Since $p$ is multiplicative, we get that

$$p(f_{t+r})\, p(f) = p(f_t)\, p(f_r).$$

Using the definition (35) of $\chi$ and the normalization (32), we deduce that $\chi$ satisfies
the functional equation

$$\chi(t + r) = \chi(t)\, \chi(r).$$    (37)

We have already seen that $\chi$ is continuous and bounded; according to a well-known
result of analysis, all such solutions of (37) are purely imaginary exponentials:

$$\chi(t) = e^{i\xi t}, \qquad \xi \text{ real}.$$    (38)


Having determined the action of p on f and its translates, we want to determine 
its action on all g in the algebra. We start by remarking that since p is continuous, and 
p(f) #0, p does not vanish in a ball of small enough radius around f. Since f can 
be approximated arbitrarily closely in norm by continuous functions with compact 


218 APPLICATIONS OF GELFAND’S THEORY 


. Support, there is such a function for which ‘p does not vanish. This shows that we 
may take f to be continuous and of compact support. 

Let $g$ be any continuous function of compact support. Then the sum (29') tends in
the sense of the $L^1$-norm to $f * g$. Using the notation (33), we can rewrite (29') as

$$\sum_j f_{j\Delta}\, g(j\Delta)\, \Delta.$$    (39)

Let $p$ act on the preceding sum; since $p$ is linear, we get

$$\sum_j p(f_{j\Delta})\, g(j\Delta)\, \Delta.$$

Combining (35) and (38), we get the following approximation to $p(f * g)$:

$$\sum_j e^{i\xi j\Delta}\, g(j\Delta)\, \Delta.$$    (40)

As $\Delta \to 0$, (40) tends to

$$\int e^{i\xi u}\, g(u)\, du,$$    (41)

that is, to the Fourier transform of $g$ except for the factor $1/\sqrt{2\pi}$. Since (39) tends to
$f * g$, and since $p$ is continuous, it follows that $p(f * g) = \hat{g}(\xi)$. Since $p$ is multiplicative,
and $p(f) = 1$, it follows that

$$p(f * g) = p(g) = \hat{g}(\xi)$$    (42)

for all continuous $g$ of compact support. Since both $p$ and the Fourier transform
at $\xi$ are continuous functions of $g$, it follows that (42) holds for all $g$ in $L^1$. We
summarize:


Theorem 10. Every multiplicative linear functional of the convolution algebra $L^1$ is
of the form (42), $\xi$ some real number. Conversely, for every $\xi$, (42) is a multiplicative
linear functional.


Proof. The first part has been demonstrated above; the second part paraphrases
the well-known fact that the Fourier transform of $f * g$ is the ordinary product of the
Fourier transforms of $f$ and $g$:

$$\widehat{f * g} = \hat{f}\, \hat{g}.$$ □


Theorem 11 (Wiener). Let $f$ be an $L^1$ function on $\mathbb{R}$ whose Fourier transform $\hat{f}(\xi)$
is nonzero for every $\xi$. Then the translates of $f$ span all of $L^1$, that is, any $L^1$ function
can be approximated in norm by linear combinations of translates of $f$.


REMARK 1. The condition of the nonvanishing of $\hat{f}$ is necessary, for if $\hat{f}$ vanishes
at $\eta$, so does the Fourier transform of any translate of $f$, of their linear combinations,
and of $L^1$ limits of them.


Proof. When $g$ is continuous and of compact support, $f * g$ is the $L^1$ limit of the
linear combination (29') of translates of $f$. It follows from lemma 8 that for any $g$
in $L^1$, $f * g$ is the $L^1$-limit of $f * g_n$, where the $g_n$ are continuous, of compact support,
and tend to $g$ in the $L^1$-norm. Therefore, to prove the theorem, it is sufficient to show
that the space of functions $f * g$, $g$ in $L^1$, is dense in $L^1$. □


Lemma 12. Let $f$ be a function as in theorem 11, and $m$ an $L^1$ function whose
Fourier transform $\hat{m}$ has compact support. Then $m$ can be written as

$$m = f * g, \qquad g \text{ in } L^1.$$    (43)

Note that the $L^1$ functions whose Fourier transforms have compact support are
dense in $L^1$; therefore theorem 11 follows from lemma 12.


Exercise 3. Show that the set of $L^1$ functions $m$ whose Fourier transform has compact
support is dense in $L^1$.


Proof of Lemma 12. Choose a compact interval $J$ so large that it contains the
support of $\hat{m}$. Construct an auxiliary function $h$ in $L^1$ whose Fourier transform is
real and has the following properties:

$$\hat{h}(\xi) = 1 \text{ on } J, \qquad \hat{h}(\xi) \le 1 \text{ everywhere}.$$    (44)

Define $f^c$ to be the conjugate of $f$, in the following sense:

$$f^c(s) = \bar{f}(-s).$$    (45)

As is well known, the Fourier transform of the conjugate of $f$ is the complex conjugate
of that of $f$:

$$\widehat{f^c}(\xi) = \overline{\hat{f}(\xi)}.$$    (45')

We make use now of the Banach algebra $\mathcal{L}$ obtained from the convolution algebra
$L^1$ by formally adjoining a unit $e$. We may regard $e$ as the Dirac $\delta$ of distribution
theory, see Appendix B. Elements of this algebra are of the form

$$\lambda e + k, \qquad k \text{ in } L^1,\ \lambda \text{ in } \mathbb{C}.$$

The functional $p_0(\lambda e + k) = \lambda$ is, obviously, multiplicative. It follows from theorem
10 that all others are of the form

$$p(\lambda e + k) = \lambda + \hat{k}(\xi), \qquad \xi \text{ real}.$$

Take, in particular, the element $e - h + f * f^c$. According to the above,
$p_0(e - h + f * f^c) = 1$, clearly nonzero. For any other $p$,

$$p(e - h + f * f^c) = 1 - \hat{h}(\xi) + |\hat{f}(\xi)|^2.$$    (46)


It follows from (44) and the nonvanishing of $\hat{f}$ that (46) is positive for all $\xi$. Thus
$e - h + f * f^c$ does not belong to the nullspace of any multiplicative linear functional.
We claim that $e - h + f * f^c$ is invertible in $\mathcal{L}$; for according to theorem 3 of
chapter 18, an element of a Banach algebra is invertible iff it does not belong to the
nullspace of any multiplicative linear functional.

Denote by $d$ the inverse of $e - h + f * f^c$:

$$(e - h + f * f^c) * d = e.$$

Multiply this relation by $m$:

$$(e - h + f * f^c) * d * m = m.$$    (47)

We claim that $(e - h) * m$ is zero. To see this, regard $e$ as the Dirac $\delta$-distribution.
The Fourier transform of $(e - h) * m$ is $(1 - \hat{h})\hat{m}$, see section B.5. According to the
construction (44), $1 - \hat{h}$ is zero in $J$ while $\hat{m}$ is zero on the complement of $J$. This
shows that the Fourier transform in the sense of distributions of $(e - h) * m$ is zero;
therefore so is $(e - h) * m$ itself. Setting this in (47) gives

$$f * f^c * d * m = m;$$

this is the desired relation (43), with $g = f^c * d * m$. □


We give now an indication how theorem 11 is used in applications. 
Let $n$ be a bounded function on $\mathbb{R}$, which has a limit as $s$ tends to $\infty$:

$$\lim_{s \to \infty} n(s) = a.$$    (48)

Let $f$ be an $L^1$ function, normalized so that

$$\int f(u)\, du = 1.$$    (49)

It follows then easily from (48) and (49) that

$$\lim_{s \to \infty} (f * n)(s) = a.$$    (50)

The question is: can one deduce (48) from (50)? A result of this kind is called a
Tauberian theorem.

Theorem 13. Let $n$ be a bounded function on $\mathbb{R}$, and $f$ an $L^1$ function normalized
as in (49). Suppose that (50) holds, that is, that the convolution of $f$ and $n$ tends to a
number $a$ as $s$ tends to $\infty$. Suppose that the Fourier transform of $f$ is nowhere zero.
Then $n(s)$ tends to $a$ in the mean in the following sense: for every value of $d$,

$$\lim_{s \to \infty} \frac{1}{d} \int_s^{s+d} n(u)\, du = a.$$    (51)
S 


Proof. (50) implies that for any $t$,

$$\lim_{s \to \infty} (f_t * n)(s) = a.$$    (50')

Taking linear combinations of (50'), we deduce that for

$$h = \sum_j c_j f_{t_j},$$    (52)

$$\lim_{s \to \infty} (h * n)(s) = a \sum_j c_j = a \int h\, du.$$    (53)

Clearly, (53) holds for any function $h$ that is the $L^1$-limit of a sequence of functions
of form (52).

If the Fourier transform of $f$ is nowhere zero, then it follows from theorem 11 that
(53) holds for all $h$ in $L^1$. Take, in particular,

$$h(s) = \begin{cases} 1/d & \text{for } 0 < s < d, \\ 0 & \text{elsewhere}; \end{cases}$$    (54)

for this $h$, (53) becomes (51). □


REMARK 2. (51) is not quite (48) but close to it. If, for example, we know that $n$ is
uniformly continuous on $\mathbb{R}_+$, then (51) implies (48).


Exercise 4. Suppose that $n$ is slowly increasing in the sense that

$$\sup_{u - 1 < v < u} n(u) - n(v)$$    (55)

tends to 0 as $u$ tends to $\infty$. Show that then (51) and (55) imply (48).


NOTE. Wiener showed how to use his Tauberian theorem to prove the prime number
theorem.


Exercise 5. Let $f$ be a function of class $L^2$ on $\mathbb{R}$. Show that the translates of $f$ span
$L^2$ if and only if $\hat{f}$ does not vanish on a set of positive Lebesgue measure.


REMARK 3. A. Beurling has shown that if $f$ belongs to all $L^p$, $1 < p$, and if $\hat{f}$
vanishes on a set of positive Hausdorff measure $\alpha$, $0 < \alpha < 1$, then the translates
of $f$ do not span $L^p$ for $p < 2/(2 - \alpha)$.


19.7 COMMUTATIVE B*-ALGEBRAS


Gelfand theory is particularly useful for studying commutative algebras of operators
over a complex Hilbert space $H$. We recall from chapter 15 the notion of the transpose
of an operator $A$ mapping a Banach space $X$ into $X$. When $A$ maps a complex
Hilbert space $H$ into itself, its conjugate transpose, called in this context its adjoint,
is another operator $A^* : H \to H$ so that the pair satisfies

$$(Ax, y) = (x, A^*y)$$    (56)

for all $x$, $y$ in $H$. Every bounded operator $A$ has an adjoint $A^*$; the following algebraic
properties follow directly from the definition:

$$(A + B)^* = A^* + B^*, \qquad (kA)^* = \bar{k} A^*, \qquad A^{**} = A, \qquad (AB)^* = B^* A^*.$$    (57)

We denote the operator norm of $A$ as $\|A\|$.

Theorem 14. For a bounded, linear mapping $A : H \to H$, $H$ a Hilbert space,

$$\|A\| = \|A^*\|,$$    (58)

$$\|A^*A\| = \|A\|^2.$$    (59)
Proof. Take the supremum of the absolute value of each side of (56) for all $x$, $y$
of unit length: $\|x\| = \|y\| = 1$. On the left take the supremum first with respect to
$y$, then with respect to $x$; the result is $\|A\|$. Reverse the order on the right, obtaining
$\|A^*\|$; this proves (58).

To prove (59), take $y = Ax$ in (56); estimating the right-hand side by the Schwarz
inequality gives

$$\|Ax\|^2 \le \|x\|\, \|A^*Ax\| \le \|x\|^2\, \|A^*A\|.$$

Taking the supremum over all unit vectors $x$ gives $\|A\|^2 \le \|A^*A\|$. Since by
submultiplicativity and (58), $\|A^*A\| \le \|A^*\|\, \|A\| = \|A\|^2$, (59) follows. □
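
Identities (58) and (59) can be checked numerically for 2×2 complex matrices. The sketch below relies on the standard finite-dimensional fact, assumed here rather than derived from the text, that the operator norm of A equals the square root of the largest eigenvalue of A*A; the norm of A*A is then computed independently by applying the same routine to A*A itself. The matrix entries are arbitrary illustrative values and all names are hypothetical.

```python
import math

def adjoint(a):
    """Conjugate transpose of a 2x2 matrix given as nested lists."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def largest_eigenvalue_hermitian(h):
    """Largest eigenvalue of a 2x2 Hermitian matrix via the quadratic formula."""
    tr = (h[0][0] + h[1][1]).real
    det = (h[0][0] * h[1][1] - h[0][1] * h[1][0]).real
    disc = max(tr * tr / 4 - det, 0.0)
    return tr / 2 + math.sqrt(disc)

def operator_norm(a):
    """Assumed formula: ||A|| = sqrt(largest eigenvalue of A*A)."""
    return math.sqrt(largest_eigenvalue_hermitian(matmul(adjoint(a), a)))

A = [[1 + 2j, 3 + 0j],
     [0j, -1j]]
A_star = adjoint(A)

norm_A = operator_norm(A)
norm_A_star = operator_norm(A_star)                  # compare with (58)
norm_A_star_A = operator_norm(matmul(A_star, A))     # compare with (59)

print(norm_A, norm_A_star, norm_A_star_A, norm_A ** 2)
```

The last two printed values should agree up to rounding, as (59) asserts, and the first two should agree as in (58).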


Definition. A complete normed algebra with a * operation that has properties (57), 
(58), and (59) is called a B*-algebra. 


An element A of a B*-algebra is called self-adjoint if A* = A; B is called anti- 
self-adjoint if B* = —B. 


Theorem 15. 


(i) The spectrum of a self-adjoint A of a commutative B* -algebra is real. 
(ii) The spectrum of an anti-self-adjoint element B is imaginary. 


Proof. According to Gelfand's theory the spectrum of $A$ is the set of numbers
$p(A)$, $p$ being any multiplicative linear functional. We claim that for $A$ self-adjoint,
$p(A)$ is real; to see this, write

$$p(A) = a + ib.$$    (60)

Let $t$ be any real number; set $T = A + itI$. Then $T^* = A - itI$, and

$$T^*T = A^2 + t^2 I.$$    (61)

From (60), $p(T) = a + i(b + t)$, so

$$|p(T)|^2 = a^2 + (b + t)^2.$$    (62)

According to theorem 1 of chapter 18, every multiplicative linear functional is a
contraction:

$$|p(T)|^2 \le \|T\|^2.$$    (63)

Using (62) on the left, and (59) and (61) on the right, we deduce from (63) that

$$a^2 + (b + t)^2 \le \|T^*T\| \le \|A\|^2 + t^2.$$

If $b \ne 0$, this is clearly false for $t$ large and of the same sign as $b$. So $b = 0$; this
proves part (i) of theorem 15.

If $B$ is anti-self-adjoint, it follows from (57) that $iB$ is self-adjoint. Thus part (ii)
follows from part (i). □


Theorem 16. Every multiplicative linear functional $p$ on a commutative B*-algebra
satisfies

$$p(T^*) = \overline{p(T)}.$$    (64)

Proof. Define $A$ and $B$ as

$$A = \frac{T + T^*}{2}, \qquad B = \frac{T - T^*}{2};$$    (65)

it follows from (57) that $A^* = A$, $B^* = -B$. Decompose $T$ and $T^*$ as

$$T = A + B, \qquad T^* = A - B.$$

Since $p$ is linear,

$$p(T) = p(A) + p(B), \qquad p(T^*) = p(A) - p(B).$$    (66)

According to theorem 15, $p(A)$ is real and $p(B)$ imaginary; therefore (64) follows
from (66). □


Theorem 17. For every $T$ in a commutative B*-algebra,

$$\|T\| = |\sigma(T)|.$$    (67)

Proof. We first prove (67) for the self-adjoint elements $A$ of the algebra. When
$A^* = A$, (59) becomes

$$\|A^2\| = \|A\|^2.$$    (68)

It follows from (57) that if $A$ is self-adjoint, so is any power of $A$; therefore it follows
from (68) that

$$\|A^{2n}\| = \|A^n\|^2.$$

Applying this to $n = 1, 2, 4, 8, \ldots, 2^k = m$ and combining these identities, we
deduce that

$$\|A^m\| = \|A\|^m.$$

Taking the $m$th root, we conclude that

$$\lim_{k \to \infty} \|A^m\|^{1/m} = \|A\|, \qquad m = 2^k.$$    (69)

According to theorem 4 of chapter 17, the limit on the left of (69) is equal to the
spectral radius of $A$; this proves (67) for $T = A$ self-adjoint.
We turn now to arbitrary $T$; it follows from (64) that for any $p$,

$$p(T^*T) = p(T^*)\, p(T) = |p(T)|^2.$$    (70)

Since every point in $\sigma(T)$ is of the form $p(T)$, it follows from (70) that the spectral
radius of $T^*T$ is the square of the spectral radius of $T$:

$$|\sigma(T^*T)| = |\sigma(T)|^2.$$    (71)

By (57), $T^*T$ is self-adjoint; since we have already proved (67) in the symmetric
case,

$$\|T^*T\| = |\sigma(T^*T)|.$$    (72)

Combining this with (71) gives

$$\|T^*T\| = |\sigma(T)|^2.$$

By (59), $\|T^*T\| = \|T\|^2$; this yields (67) for all $T$. □


NOTE. Gelfand invented his theory of commutative Banach algebras to derive the re- 
sults presented in section 7. The application to Wiener’s theorem was an afterthought. 


HISTORICAL NOTE. Tauberian theorems were named so by Hardy and Littlewood 
after the obscure Austrian mathematician Alfred Tauber, 1866-1942, who wrote the 
first paper on the subject in 1897. His major contributions were to actuarial science. 
In 1942 he was deported to the Theresienstadt concentration camp. 


BIBLIOGRAPHY 




Beurling, A. On a closure problem, Arkiv Mat., 1, (19xx): 301-303. 

Carleson, L. Interpolation by bounded analytic functions and the corona problem. An. Math., 76 (1962):
547-559.

Gelfand, I. M. Normierte Ringe. Mat. Sbornik, N.S., 9 (51) (1941): 3-24. 


Koosis, P. Introduction to Hp Spaces. Cambridge Tracts in Mathematics, 115. Cambridge University
Press, Cambridge, 1998.


Wiener, N. Tauberian theorems. An. Math., (2), 33 (1932): 1-100.


20


EXAMPLES OF OPERATORS AND THEIR SPECTRA


In this chapter we discuss and illustrate phenomena in the spectral theory of operators 
that go beyond the mere fact that operators belong to the Banach algebra of bounded 
linear maps £(X, X); we exploit the fact that these are operators that act on elements 
of a Banach space. 


20.1 INVERTIBLE MAPS 


First we address the operation of invertibility of a map M in L(X, X). By definition, 
M is invertible iff it maps X onto X in a one-to-one fashion. The inverse is necessar- 
ily bounded, thanks to the closed graph theorem; see theorem 11 of chapter 15. Thus 
there are only two ways for M to fail to be invertible: 


(a) M is not one-to-one.
(b) M is not onto.


It follows that if the product MK of two maps is invertible, then K is one-to-one,
and M is onto. If M and K commute, meaning that MK = KM, we deduce from
invertibility of the product that M and K are each one-to-one and onto, and
therefore both M and K are invertible.

We have seen in theorem 2, chapter 17, that if an element $K$ of a Banach algebra
is invertible, then so are all nearby elements $K - A$, $|A| <$ const. For maps we have
the following additional result:


Theorem 1. Let $X$ be a Banach space, $K : X \to X$ a bounded linear map that maps $X$
onto itself. Then all nearby maps, namely those of the form

$$K - A, \qquad |A| < \epsilon, \text{ with } \epsilon \text{ small enough},$$    (1)

also map $X$ onto itself.


Proof. It follows from theorem 9 of chapter 15 that there is a constant $k$ such that
for any $z$ in $X$ there is an $x$ in $X$ such that

$$Kx = z, \qquad |x| \le k|z|.$$    (2)

We claim now that for any linear map $A : X \to X$ whose norm satisfies $|A| < 1/k$,
$K - A$ maps $X$ onto itself. Furthermore we claim that for any $u$ there is an $x$ such
that

$$(K - A)x = u, \qquad |x| \le \frac{k}{1 - k|A|}\, |u|.$$    (3)

This $x$ will be constructed as the limit of a sequence of approximations $\{x_n\}$, defined
recursively as follows:

$$Kx_{n+1} = Ax_n + u, \qquad x_0 = 0.$$    (4)

By (2), the first equation (4), $Kx_1 = u$, has a solution $x_1$ satisfying $|x_1| \le k|u|$. For
$n \ge 1$ we construct $x_{n+1}$ by subtracting two consecutive equations (4):

$$K(x_{n+1} - x_n) = A(x_n - x_{n-1}).$$    (5)

By (2), (5) has a solution $(x_{n+1} - x_n)$ satisfying

$$|x_{n+1} - x_n| \le k|A|\, |x_n - x_{n-1}|.$$    (5')

Since $k|A| < 1$, it follows from (5') that the sequence $\{x_n\}$ is convergent. Letting
$n \to \infty$ in (4) gives (3), with $x = \lim x_n$. Since $x_0 = 0$, $x = \sum_0^\infty (x_{n+1} - x_n)$;

$$|x| \le \sum_0^\infty |x_{n+1} - x_n| \le \sum_0^\infty (k|A|)^n |x_1|,$$

which is $\le \frac{k}{1 - k|A|}\, |u|$, as asserted in (3). □
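
The recursion (4) can be run directly in a small case. The sketch below assumes X is the plane with the max norm, so that K and A are 2×2 real matrices and the operator norm is the largest absolute row sum; in finite dimensions a map onto X is automatically invertible, so the constant k of (2) can be taken to be the norm of the inverse of K. The matrices and the vector u are illustrative values only, and the helper names are hypothetical.

```python
def solve2(K, z):
    """Solve K x = z for a 2x2 matrix K by Cramer's rule."""
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    return [(z[0] * K[1][1] - K[0][1] * z[1]) / det,
            (K[0][0] * z[1] - z[0] * K[1][0]) / det]

def apply(A, x):
    """Matrix-vector product A x."""
    return [A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1]]

def vec_norm(x):
    """Max norm on the plane."""
    return max(abs(x[0]), abs(x[1]))

def mat_norm(A):
    """Operator norm induced by the max norm: largest absolute row sum."""
    return max(abs(A[0][0]) + abs(A[0][1]), abs(A[1][0]) + abs(A[1][1]))

K = [[2.0, 1.0], [0.0, 3.0]]
A = [[0.3, -0.2], [0.1, 0.4]]
u = [1.0, -2.0]

# Constant k of (2): the norm of the inverse of K, built column by column.
c0, c1 = solve2(K, [1.0, 0.0]), solve2(K, [0.0, 1.0])
k = mat_norm([[c0[0], c1[0]], [c0[1], c1[1]]])
assert k * mat_norm(A) < 1          # hypothesis |A| < 1/k of the proof

# Recursion (4): K x_{n+1} = A x_n + u, starting from x_0 = 0.
x = [0.0, 0.0]
for _ in range(60):
    Ax = apply(A, x)
    x = solve2(K, [Ax[0] + u[0], Ax[1] + u[1]])

Kx, Ax = apply(K, x), apply(A, x)
residual = vec_norm([Kx[0] - Ax[0] - u[0], Kx[1] - Ax[1] - u[1]])
bound = k / (1 - k * mat_norm(A)) * vec_norm(u)     # right side of (3)

print(x, residual, bound)
```

Here k|A| = 1/3, so the differences of consecutive iterates shrink at least by that factor at each step, and the limit satisfies the estimate (3).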


Recall from chapter 17 that the spectrum $\sigma(M)$ of a bounded linear map $M$ is the
set of all complex numbers $\lambda$ for which $(\lambda I - M)$ is not invertible.


Theorem 2. Let $M : X \to X$ be a bounded linear operator, $\lambda$ a boundary point of
the spectrum of $M$. Then the range of $\lambda - M$ is not all of $X$.


Proof. Since $\lambda$ is assumed to be a boundary point of $\sigma(M)$, it is the limit of a
sequence $\lambda_n$ in the resolvent set of $M$. According to corollary 3' of theorem 3 in
chapter 17, the norm of the resolvent is $\ge$ the reciprocal of the distance to some
point of the spectrum:

$$|(\zeta - M)^{-1}| \ge |\zeta - \lambda|^{-1}.$$    (6)

This implies that there is a $u$, depending on $\zeta$, such that $(\zeta - M)^{-1}u = x$ is large
compared to $u$:

$$(\zeta - M)x = u, \qquad |x| > \frac{1}{2|\zeta - \lambda|}\, |u|.$$    (7)

Now in theorem 1 take $K = \lambda - M$, $A = (\lambda - \zeta)I$. Suppose that, contrary to theorem 2,
the range of $K = \lambda - M$ is all of $X$; then we can take $\zeta = \lambda_n$ so close to $\lambda$ that
$|A| = |\lambda - \zeta|$ is small enough for $(K - A)x = u$ to have a solution $x$ whose norm
satisfies (3). This is in direct contradiction to relation (7), which asserts that the
unique solution of

$$(K - A)x = (\zeta - M)x = u$$

has a norm $|x|$ that is very large compared to $|u|$. □


Theorem 3. Let $M$ be a bounded linear map $X \to X$, $M' : X' \to X'$ its transpose.
Then

$$\sigma(M') = \sigma(M).$$    (8)


Proof. The proof is based on this simple observation:
$K : X \to X$ is invertible iff its transpose $K' : X' \to X'$ is invertible. Now, if $K$ is
invertible, it has an inverse $L$:

$$KL = LK = I.$$    (9)

The transpose of these relations is

$$L'K' = K'L' = I,$$    (9')

which shows that $L'$ is the inverse of $K'$. When $X$ is reflexive, the relation of $K$
and $K'$ is symmetric, so the proof is complete. In the nonreflexive case an additional
argument is needed. Suppose that $K'$ is invertible; then (9') holds, and so by
transposition

$$K''L'' = L''K'' = I''.$$    (9'')

Since $K''$ and $I''$, when restricted to $X$, are $K$ and $I$, respectively, it follows from (9'')
that the nullspace of $K$ is trivial. It follows that $K$ is one-to-one, and by (9'') that $L''$
restricted to the range of $K$ is inverse to $K$. Since $L''$ is a continuous map, it follows
that the range of $K$ is closed. We claim that the range of $K$ is all of $X$, for if not,
there would be, by Hahn-Banach, a linear functional $\ell \ne 0$ annihilating $R_K$, but such
an $\ell$, according to theorem 5 of chapter 15, belongs to the nullspace of $K'$. Since $K'$
is assumed invertible, this cannot be.

Theorem 3 follows from the observation by setting $K = \lambda - M$. □




Corollary 1. Let $M$ be a bounded linear map of a Hilbert space $H$ into $H$, $M^*$ its adjoint. Then

$$\sigma(M^*) = \overline{\sigma(M)}.$$
Exercise 1. Prove corollary 3. 


The discussion at the beginning of this chapter shows that $\lambda$ can enter the spectrum of a map $M$ in two ways:

(a) $(\lambda I - M)$ has a nontrivial nullspace in $X$.
(b) The range of $(\lambda I - M)$ is a proper subspace of $X$.

A nonzero vector in the nullspace of $\lambda I - M$ is called an eigenvector of $M$. Since in an infinite-dimensional space alternative (b) could hold without (a) being the case, a point $\lambda$ in the spectrum of $M$ need not be an eigenvalue of $M$. This is the reason why eigenvectors play a less prominent role in the general theory of linear operators than in the finite-dimensional case. They do play a role, as we will see in the examples below, and in the chapters on compact operators.

We turn now to some examples: 


20.2 SHIFTS 


Take $X$ to be $\ell^2$, the space of vectors $x$ with complex components:

$$x = (a_0, a_1, \ldots), \qquad \sum |a_j|^2 < \infty. \tag{10}$$

The right shift $R$ and left shift $L$ are defined by

$$Rx = (0, a_0, a_1, \ldots), \qquad Lx = (a_1, a_2, \ldots). \tag{11}$$

Clearly, $LR = I$, $RL \ne I$, so neither $L$ nor $R$ is invertible. A moment's calculation shows that $R$ and $L$ are transposes of each other:

$$R' = L, \qquad L' = R. \tag{12}$$


Theorem 4. The spectrum of $R$ and of $L$ consists of the unit disk $|\lambda| \le 1$.

Proof. Obviously $L$ is a contraction, that is, $\|Lx\| \le \|x\|$. Since equality holds for some $x$, it follows that $\|L\| = 1$, and similarly that for any positive integer $n$, $\|L^n\| = 1$. It follows from this that

$$\lim_{n \to \infty} \|L^n\|^{1/n} = 1, \tag{13}$$

so according to theorem 4 of chapter 17, the spectral radius of $L$ equals 1. This shows that no complex number $\zeta$, $|\zeta| > 1$, belongs to the spectrum of $L$.




Next we determine the eigenvalues and eigenvectors of $L$. Suppose that $Lx = \lambda x$; by definition (11) this means that $(a_1, a_2, a_3, \ldots) = \lambda(a_0, a_1, a_2, \ldots)$, which implies that $a_1 = \lambda a_0$, $a_2 = \lambda a_1$, and so on, so that

$$a_n = \lambda^n a_0. \tag{14}$$

Since $x$ belongs to $\ell^2$, $\sum |a_n|^2 < \infty$; this is satisfied by (14) iff $|\lambda| < 1$. So we see that the eigenvalues of $L$ are all complex numbers $\lambda$, $|\lambda| < 1$. All eigenvalues belong to $\sigma(L)$; since the spectrum is closed, all $\lambda$ in the closed unit disk $|\lambda| \le 1$ belong to $\sigma(L)$. Since $R$ and $L$ are adjoint to each other, it follows from theorem 3 that $\sigma(L) = \sigma(R)$; from this and the previous two statements theorem 4 follows. □
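The eigenvector computation in the proof can be checked numerically on a truncation. The sketch below is only an illustration under stated assumptions: the names `left_shift` and `geometric_vector` are hypothetical, and a list of 60 components stands in for an infinite sequence, so it shows the pattern $a_n = \lambda^n a_0$ rather than the operator on $\ell^2$ itself.

```python
# Finite sketch of the eigenvectors of the left shift L on l^2.
# Assumption: only the first n_terms components are kept.

def left_shift(x):
    """(a0, a1, a2, ...) -> (a1, a2, ...): drop the first component."""
    return x[1:]

def geometric_vector(lam, n_terms):
    """First n_terms components of x = (1, lam, lam^2, ...), i.e. a_n = lam^n."""
    return [lam ** n for n in range(n_terms)]

lam = 0.5 + 0.5j                      # any complex number with |lam| < 1
x = geometric_vector(lam, 60)

# L x agrees with lam * x on every component that survives the truncation.
shifted = left_shift(x)
residual = max(abs(s - lam * a) for s, a in zip(shifted, x))

# The squared norm is a geometric series with limit 1 / (1 - |lam|^2),
# which is finite exactly because |lam| < 1.
norm_sq = sum(abs(a) ** 2 for a in x)
limit = 1.0 / (1.0 - abs(lam) ** 2)

print(residual, norm_sq, limit)
```

For $|\lambda| \ge 1$ the same partial sums grow without bound, which is why such $\lambda$ are not eigenvalues of $L$.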


Exercise 2. Show that $R$ has no eigenvalues.


Exercise 3. Show that the spectrum of $R$ and $L$ acting on the spaces $\ell^p$, $1 \le p \le \infty$, consists of all points of the unit disk.


20.3 VOLTERRA INTEGRAL OPERATORS 


Take $X$ to be $C[0, 1]$, and $V$ to be integration:

$$(Vx)(s) = \int_0^s x(r)\, dr. \tag{15}$$


Theorem 5. The spectrum of the operator (15) acting on $C[0, 1]$ consists of the single point $\lambda = 0$.

Proof. The $n$-fold iterate of $V$ is given by the formula

$$(V^n x)(s) = \frac{1}{(n-1)!} \int_0^s (s - r)^{n-1} x(r)\, dr, \tag{15'}$$

as may be verified by induction, using integration by parts. For any $s$ in $[0, 1]$,

$$|(V^n x)(s)| \le \frac{1}{(n-1)!} \int_0^s (s - r)^{n-1} |x|\, dr \le \frac{|x|}{n!}.$$

So for any $x$ in $C[0, 1]$, $|V^n x| \le |x|/n!$, which implies, by definition of norm, that $|V^n| \le 1/n!$. It follows from this that

$$\lim_{n \to \infty} |V^n|^{1/n} = 0,$$

and so by theorem 4 of chapter 17, the spectral radius of $V$ is zero. Since the spectrum is nonempty, theorem 5 follows. □
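The bound $|V^n| \le 1/n!$ can be seen in action on the constant function $x(s) = 1$, for which $V^n x = s^n/n!$. The following sketch uses exact rational arithmetic; the polynomial representation and the helper names are assumptions made for this illustration only.

```python
from fractions import Fraction
from math import factorial

def integrate(coeffs):
    """Apply V to the polynomial sum c_k s^k given as [c0, c1, ...]:
    the integral from 0 to s of c_k r^k is c_k s^(k+1) / (k+1)."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]

def sup_norm_on_unit_interval(coeffs):
    """For nonnegative coefficients the maximum on [0, 1] is the value at s = 1."""
    return sum(coeffs)

x = [Fraction(1)]                       # the constant function x(s) = 1
norms = []
for n in range(1, 8):
    x = integrate(x)                    # x now represents V^n 1 = s^n / n!
    norms.append(sup_norm_on_unit_interval(x))

# The maximum norm of V^n 1 equals 1/n!, so the bound is attained.
matches_bound = norms == [Fraction(1, factorial(n)) for n in range(1, 8)]

# n-th roots of the norms: these decrease toward 0, as the proof requires.
roots = [float(v) ** (1.0 / n) for n, v in enumerate(norms, start=1)]
print(matches_bound, roots)
```

Since $|1| = 1$, this also shows that $|V^n| = 1/n!$ exactly, not merely that the inequality holds.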




Example 1. $X = \ell^2$; $\{\lambda_n\}$ a given bounded sequence of complex numbers. For $x$ given by (10), define

$$Mx = (\lambda_0 a_0, \lambda_1 a_1, \ldots, \lambda_n a_n, \ldots). \tag{16}$$

Exercise 4. Prove that the spectrum of $M$ as defined by (16) is the closure of the set $\{\lambda_n\}$.

Exercise 5. Take $X$ to be $\ell^p$, $1 \le p \le \infty$, and define $M$ by (16). Prove that $\sigma(M)$ is the closure of the set $\{\lambda_n\}$.

Exercise 6. Suppose that the kernel of the integral operator

$$(Kf)(s) = \int_0^s K(s, t) f(t)\, dt \tag{17}$$

is a continuous function of $s$, $t$ in $t \le s$.

(a) Show that $K$ maps $C[0, 1]$ into $C[0, 1]$.
(b) Show that the spectrum of $K$ consists of the single point 0.


Operators of form (17) are called Volterra operators, after Vito Volterra, who first 
investigated their theory. 


20.4 THE FOURIER TRANSFORM 
Denote the Fourier transform by $F$:

$$(Ff)(u) = \frac{1}{\sqrt{2\pi}} \int f(x) e^{-ixu}\, dx = \tilde f(u).$$

We take it as known from the theory of the Fourier transform, see Appendix B, that $F$ is an invertible, norm-preserving map of $L^2(\mathbb{R})$ onto $L^2(\mathbb{R})$. The inverse is given by

$$f(x) = \frac{1}{\sqrt{2\pi}} \int \tilde f(u) e^{ixu}\, du.$$

Replacing $x$ by $-x$ gives

$$f(-x) = \frac{1}{\sqrt{2\pi}} \int \tilde f(u) e^{-ixu}\, du.$$

Denote the mapping $f(x) \to f(-x)$ by $R$; it follows from the formula above that $F^2 = R$. Since $R^2 = I$, it follows that

$$F^4 = I.$$




It follows then from the spectral mapping theorem that the spectrum of $F$ lies on the set consisting of the fourth roots of 1: $\pm 1$, $\pm i$.
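The step from $F^4 = I$ to the four possible spectral values can be written out, together with two eigenfunctions that realize two of them. This is a sketch that assumes the kernel of $F$ is normalized as $e^{-ixu}/\sqrt{2\pi}$; with the opposite sign convention the eigenvalues $i$ and $-i$ trade places.

```latex
% Spectral mapping applied to the polynomial p(z) = z^4:
\lambda \in \sigma(F) \;\Longrightarrow\; \lambda^4 \in \sigma(F^4) = \sigma(I) = \{1\},
\qquad\text{so}\qquad \lambda \in \{1, -1, i, -i\}.

% Two eigenfunctions, with g(x) = e^{-x^2/2}:
(Fg)(u) = \frac{1}{\sqrt{2\pi}} \int e^{-x^2/2} e^{-ixu}\,dx = e^{-u^2/2}
\quad\text{(eigenvalue } 1\text{)},

% differentiating under the integral sign gives F(xg) = i\,\frac{d}{du}(Fg):
\bigl(F(xg)\bigr)(u) = i \frac{d}{du} e^{-u^2/2} = -i\,u\,e^{-u^2/2}
\quad\text{(eigenvalue } -i\text{)}.
```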


Exercise 7.

(a) Show that $F$ maps the space of functions of the form $p(x)e^{-x^2/2}$, $p$ a polynomial of degree $< n$, into itself.
(b) Show that $F$ has eigenfunctions of form $p(x)e^{-x^2/2}$.
(c) Show that the eigenfunctions in (b) span all of $L^2(\mathbb{R})$. (Hint: See section 9.1.)


BIBLIOGRAPHY 


Volterra, V. Sulla inversione degli integrali definiti. An. Mat. (2), 25 (1897): 139-178. 


21 


COMPACT MAPS 


The notion of compact maps, and their properties, are the bread and butter of func- 
tional analysis. 

We recall that a subset $S$ of a complete metric space is called precompact if its closure is compact. The following are useful criteria for precompactness:

(a) $S$ is precompact if and only if every sequence of points of $S$ contains a Cauchy subsequence.

(b) $S$ is precompact iff for every $\epsilon > 0$ it can be covered by a finite number of balls of radius $\epsilon$.

We turn to precompact subsets of Banach spaces. The following are easily deduced from (a) or (b).

(c) If $C_1$ and $C_2$ are precompact subsets of a Banach space $X$, then $C_1 + C_2$ is precompact.

(d) If $C$ is a precompact set in a Banach space, so is its convex hull.

(e) If $C$ is a precompact subset of a Banach space $X$, $M$ a linear, bounded map of $X$ into another Banach space $U$, then $MC$ is a precompact subset of $U$.


Exercise 1. Prove statements (c), (d) and (e).


21.1 BASIC PROPERTIES OF COMPACT MAPS 


Definition. $X$ and $U$ denote Banach spaces. A linear map $C : X \to U$ is called compact if the image $CB$ of the unit ball $B$ in $X$ is precompact in $U$.


Theorem 1.

(i) The sum of two compact maps $X \to U$ is compact.
(ii) The scalar multiple of a compact map is compact.
(iii) Let $V$ be a Banach space, $M : U \to V$ a bounded linear map, $C : X \to U$ compact. Then the product $MC : X \to V$ is compact.




(iv) Let $Z$ denote a Banach space, $N : Z \to X$ a bounded linear map, $C : X \to U$ compact. Then $CN : Z \to U$ is compact.
(v) Let $C_n : X \to U$ be a sequence of compact maps that converge uniformly to $C$,

$$\lim |C_n - C| = 0. \tag{1}$$

Then $C$ is compact.


Proof. Denote by $C_1$, $C_2$ two compact linear maps of $X \to U$; that means that the images $C_1B$ and $C_2B$ of the unit ball $B$ in $X$ are precompact. According to (c) above, $C_1B + C_2B$ is then precompact. Since $(C_1 + C_2)B$ is contained in $C_1B + C_2B$, and since a subset of a precompact set is precompact, part (i) follows.

Part (ii) is a special case of (iii). Part (iii), in turn, follows from property (e) above of precompact sets.

(iv) Since the bounded map $N$ carries the unit ball of $Z$ into some ball in $X$, $CNB$ is precompact.

(v) Given any $\epsilon > 0$, choose $n$ so large that $|C_n - C| < \epsilon$. Since $C_n$ is a compact map, $C_nB$ can be covered by a finite number of balls of radius $\epsilon$; but then $CB$ is covered by balls of radius $2\epsilon$ around the same centers. □
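Part (v) is commonly used to establish compactness by approximating with maps of finite-dimensional range, which are compact (exercise 2 of this chapter). A sketch, assuming the diagonal operator on $\ell^2$ from example 1 of chapter 20 with the extra hypothesis $\lambda_n \to 0$; the truncations $M_N$ are notation introduced here for illustration only.

```latex
Mx = (\lambda_0 a_0, \lambda_1 a_1, \ldots), \qquad
M_N x = (\lambda_0 a_0, \ldots, \lambda_N a_N, 0, 0, \ldots),

|(M - M_N)x|^2 = \sum_{n > N} |\lambda_n|^2 |a_n|^2
  \le \Bigl(\sup_{n > N} |\lambda_n|\Bigr)^2 |x|^2,

|M - M_N| \le \sup_{n > N} |\lambda_n| \longrightarrow 0 \quad (N \to \infty).
% Each M_N has finite-dimensional range, so M is compact by part (v).
```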


Suppose that $U = X$; in the language of algebra, theorem 1 says that the compact maps form a closed two-sided ideal in $\mathcal{L}(X)$. According to a theorem of Calkin, this is the only closed two-sided ideal when $X$ is a Hilbert space.


Theorem 2. $X$ and $U$ are Banach spaces, $C : X \to U$ a compact linear map. Let $Y$ be a closed subspace of $X$, and $V$ the closure in $U$ of $CY$.

(i) The restriction of $C$ to $Y \to V$ is a compact map.

(ii) Suppose that $U = X$, and the closed subspace $Y$ is invariant under $C$, namely is mapped into itself by $C$. Then $C : X/Y \to X/Y$ is compact.


Proof. Part (i) is utterly obvious; so is part (ii), once the definitions of the norms 
in the quotient spaces are put in place. 


Exercise 2. Prove that a degenerate bounded linear map $D$ ($\dim R_D < \infty$) is compact.


In the rest of this chapter we present a theory of compact operators due to F. Riesz. 
We start by restating lemma 7 from chapter 5, on the geometry of normed linear 
spaces. This lemma will be used over and over again: 


Lemma 3. $X$ is a normed linear space, $Y$ a closed linear subspace of $X$ properly contained in $X$. Then there is an $x$ in $X$ such that

$$|x| = 1 \quad \text{and} \quad d(x, Y) = \inf_{y \text{ in } Y} |x - y| > \tfrac{1}{2}. \tag{2}$$




In chapter 5 this lemma was used to prove theorem 6, which asserts that the unit ball in a normed linear space is compact iff the space is finite dimensional. Here we use it to study compact maps $C$ of a Banach space $X$ into itself. The next three theorems are the basic results concerning these maps.


Theorem 4. Let $C$ be a compact map of a Banach space $X \to X$; denote by $I$ the identity map $X \to X$; set

$$T = I - C. \tag{3}$$

(i) The nullspace $N_T$ of $T$ is finite-dimensional.
(ii) Denote by $N_j$ the nullspace of $T^j$,

$$N_j = N_{T^j}. \tag{4}$$

There is an integer $i$ such that

$$N_k = N_i \quad \text{for } k > i. \tag{5}$$

(iii) The range $R_T$ of $T$ is closed.


Proof. (i) By definition (3) of $T$, $y$ lies in $N_T$ iff $y = Cy$. Since $C$ is assumed compact, it follows that the unit ball in $N_T$ is precompact. But then according to the result quoted above, $N_T$ is finite-dimensional.

(ii) Assume, on the contrary, that (5) fails for all $i$, that is, that $N_{i-1}$ is a proper subset of $N_i$ for all $i$. By lemma 3, there would be for every $i$ a vector $y_i$ such that

$$y_i \text{ in } N_i, \qquad |y_i| = 1, \qquad d(y_i, N_{i-1}) > \tfrac{1}{2}. \tag{6}$$

Take $m < n$; by definition (3) of $T$,

$$Cy_n - Cy_m = y_n - Ty_n - y_m + Ty_m. \tag{7}$$

The last three terms on the right belong to $N_{n-1}$, so, by (6), their sum differs from $y_n$ by $\tfrac{1}{2}$ at least. This proves that $|Cy_n - Cy_m| > \tfrac{1}{2}$. Clearly, the sequence $\{Cy_n\}$ contains no Cauchy subsequence. Since each $|y_n| = 1$, this contradicts compactness of $C$.

(iii) We have to show that if $\{y_k\}$ is a convergent sequence of points in $R_T$,

$$\lim y_k = y, \qquad y_k = Tx_k, \tag{8}$$

then their limit $y$ also belongs to $R_T$. Denote by $d_k$ the distance of $x_k$ from $N_T$:

$$d_k = \inf_{z \text{ in } N_T} |x_k - z|. \tag{9}$$

We claim that the sequence $d_k$ is bounded. Indeed, we can choose $z_k$ in $N_T$ so that $w_k = x_k - z_k$ satisfies

$$|w_k| = |x_k - z_k| < 2d_k. \tag{9'}$$




Since $z_k$ lies in $N_T$,

$$Tw_k = Tx_k - Tz_k = y_k. \tag{10}$$

Suppose that the $d_k$ were unbounded. Since it follows from (8) that the $|y_k|$ are bounded, we divide (10) by $d_k$ and conclude that, along a subsequence on which $d_k \to \infty$,

$$T\frac{w_k}{d_k} = \frac{y_k}{d_k} \to 0. \tag{11}$$

Set $u_k = w_k/d_k$. It follows from (9') that $|u_k| < 2$. Using (11) and the definition (3) of $T$, we see that $u_k - Cu_k \to 0$. Since $C$ is compact, the second term has a convergent subsequence; but then so does the first term:

$$u_k \to u. \tag{12}$$

Since $T$ is continuous, it follows from (11) that the limit $u$ satisfies $\lim Tu_k = Tu = 0$, that is, $u$ belongs to $N_T$. On the other hand, it follows from (9) that $|w_k - z| \ge d_k$ for all $z$ in $N_T$. Dividing by $d_k$ and using $u_k = w_k/d_k$, we deduce that $|u_k - z| \ge 1$ for all $z$ in $N_T$. Since we may take $z = u$, this contradicts (12) and shows that the sequence $d_k$ is bounded.

Using the definition (3) of $T$, we deduce from (10) and (8) that

$$w_k - Cw_k = y_k \to y. \tag{13}$$

It follows from (9') and the boundedness of $d_k$ that the sequence $w_k$ is bounded. Then by the compactness of $C$, the second term on the left in (13) has a convergent subsequence. It follows from (13) that then the same subsequence of $w_k$ converges to a limit $w$, and since $T$ is continuous, that $w$ satisfies

$$w - Cw = Tw = y.$$

This proves that the range of $T$ is closed, as asserted. □
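Part (ii) of theorem 4 can be watched in finite dimensions, where every linear map is compact. The sketch below uses a hypothetical $4 \times 4$ matrix $T = I - C$ chosen for illustration and computes $\dim N_j$ as $4 - \operatorname{rank} T^j$ with exact rational elimination; the helper names are assumptions, not part of the text.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def rank(matrix):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r

# Hypothetical example: a nilpotent 3x3 block plus one invertible diagonal entry.
T = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 2]]

power = T
null_dims = []
for j in range(1, 6):
    null_dims.append(len(T) - rank(power))   # dim N_j = n - rank(T^j)
    power = mat_mul(power, T)

print(null_dims)
```

In this example the nullspaces grow strictly for $j = 1, 2, 3$ and then stop, so the integer $i$ of relation (5) is 3.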


The next result states that for $C$ compact the range of $T = I - C$ has finite codimension, equal to the dimension of the nullspace of $T$. Recalling from chapter 2 the notion of the index of a map as the difference of the two, we state the result as follows:

Theorem 5. Let $C$ be a compact map of a Banach space $X \to X$. Then $T = I - C$ satisfies

$$\operatorname{ind} T = \dim N_T - \operatorname{codim} R_T = 0. \tag{14}$$

Proof. We start with the special case that $N_T$ is trivial:

$$\dim N_T = 0. \tag{15}$$




We show that then $\operatorname{codim} R_T = 0$, which means that $R_T$ is the whole space $X$. Now suppose, on the contrary, that $R_T = X_1$ is a proper subspace of $X$. Then, since by assumption (15), $T$ is one-to-one, it follows that $TX_1 = X_2$ is a proper subspace of $X_1$. Define $X_k$ as $T^kX$. We deduce similarly that $X \supset X_1 \supset X_2 \supset \cdots$, and that all inclusions are proper.

According to part (iii) of theorem 4, $X_1$, the range of $T$, is closed. We claim that every subspace $X_k$ is closed. Indeed $X_k$ is the range of $T^k$, and

$$T^k = (I - C)^k = I + \sum_{j=1}^{k} (-1)^j \binom{k}{j} C^j$$

is, according to theorem 1, of the form $I$ plus a compact operator. Therefore we can conclude from theorem 4 that $X_k$ is closed. We appeal now to lemma 3; we can choose $x_k$ in $X_k$ so that

$$|x_k| = 1, \qquad \operatorname{dist}(x_k, X_{k+1}) > \tfrac{1}{2}. \tag{16}$$

Let $m$ and $n$ be two distinct indices, $m < n$. Then, using the definition (3) of $T$,

$$Cx_m - Cx_n = x_m - Tx_m - x_n + Tx_n.$$

The last three terms on the right all belong to $X_{m+1}$; therefore, by (16), $|Cx_m - Cx_n| > \tfrac{1}{2}$. This contradicts the assumption that $C$ maps the unit ball into a precompact set, and completes the proof of (14) under the assumption (15).
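The claim used above, that $T^k$ is again of the form $I$ minus a compact operator, is easiest to see for small $k$. The following worked cases are only an illustration of the binomial expansion; nothing beyond theorem 1 is assumed.

```latex
T^2 = (I - C)^2 = I - (2C - C^2), \qquad
T^3 = (I - C)^3 = I - (3C - 3C^2 + C^3).
% By theorem 1, parts (i)-(iii), 2C - C^2 and 3C - 3C^2 + C^3 are compact,
% so theorem 4 applies to T^2 and T^3 just as it does to T.
```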

We turn now to the case when $T$ has a nontrivial nullspace; according to theorem 4 there is an index $i$ such that

$$N_{i+1} = N_i, \tag{17}$$

where $N_i$ is defined by formula (4). It follows from that definition that $N = N_i$ is an invariant subspace of $T$, and therefore also of $C$. So we may apply part (ii) of theorem 2 and conclude that $C : X/N \to X/N$ is a compact mapping. We claim that $T : X/N \to X/N$ has trivial nullspace; for a nontrivial nullvector would imply that some $x$ not in $N$ is mapped into $N$. Since $N$ is the nullspace of $T^i$, this would put $x$ into $N_{i+1}$, the nullspace of $T^{i+1}$. This of course contradicts (17). Thus $T$ on $X/N$ satisfies assumption (15); it follows then that $T$ maps $X/N$ one-to-one onto itself. This means that for any $y$ in $X$ there is an $x$ in $X$, $z$ in $N$ such that

$$Tx = y + z. \tag{18}$$

We can express this as

$$X = R_T + N, \tag{19}$$

meaning that every $y$ in $X$ can be expressed as the sum of a vector in $R_T$ and of one in $N$. These spaces have, for $i > 1$, a nontrivial intersection, consisting of those vectors $n$ in $N$ that lie in the range of $T$; that is, the vectors $n$ are of the form

$$n = Tz. \tag{20}$$

It follows from (17) that $z$ in (20) belongs to $N$. According to a basic theorem of linear algebra, the dimension of the nullspace of $T$ in $N$ equals the codimension of the range of $T$ in $N$. It follows that the dimension of the space of vectors of form (20) is

$$\dim N - \dim N_T. \tag{21}$$

Combining this with (19) shows that the codimension of $R_T$ in $X$ is equal to $\dim N_T$, as asserted in (14). □


21.2 THE SPECTRAL THEORY OF COMPACT MAPS 


Theorem 6 (F. Riesz). $X$ denotes a Banach space, $C$ a compact linear map of $X \to X$.

(i) The spectrum of $C$ consists of an at most denumerable set of complex numbers $\{\lambda_n\}$ that accumulate only at 0. If $\dim X = \infty$, 0 belongs to $\sigma(C)$.

(ii) Each nonzero $\lambda_j$ is a point eigenvalue of $C$, of finite multiplicity; that is, for each $\lambda = \lambda_j$,

the nullspace of $C - \lambda$ is finite-dimensional,

there is an integer $i$ such that the nullspace of $(C - \lambda)^k$ is the same as the nullspace of $(C - \lambda)^i$ for all $k > i$.

(iii) The resolvent $(\zeta - C)^{-1}$ has a pole at each nonzero $\lambda_j$.


Proof. Define $T$ for $\zeta \ne 0$ as $T = I - \zeta^{-1}C$. It follows from theorem 5 that if the nullspace of $T$ is trivial, meaning that $\dim N_T = 0$, then the range of $T$ is all of $X$. This shows that every nonzero point of the spectrum of $C$ is an eigenvalue. It follows from part (ii) of theorem 4 that the multiplicity of an eigenvalue is finite; this proves part (ii) of theorem 6.

We turn to (i). To show that the eigenvalues $\lambda_n$ of $C$ can accumulate only at 0, consider an infinite sequence $\{\lambda_n\}$, $\lambda_n \ne \lambda_m$ for $n \ne m$, of eigenvalues, with corresponding eigenvectors $x_n$:

$$Cx_n = \lambda_n x_n. \tag{22}$$

Define $Y_n$ to be the linear space spanned by $x_1, \ldots, x_n$. Since eigenvectors pertaining to distinct eigenvalues are linearly independent, $Y_{n-1}$ is a proper subset of $Y_n$. We apply now lemma 3, with $X = Y_n$ and $Y = Y_{n-1}$; there is a $y_n$ in $Y_n$ such that

$$|y_n| = 1, \qquad |y_n - y| > \tfrac{1}{2} \quad \text{for all } y \text{ in } Y_{n-1}. \tag{23}$$

By definition of $Y_n$, $y_n$ is of the form

$$y_n = \sum_1^n a_j x_j.$$




Thus

$$Cy_n - \lambda_n y_n = \sum_1^n (\lambda_j - \lambda_n) a_j x_j \in Y_{n-1}. \tag{24}$$

This shows that for $n > m$,

$$Cy_n - Cy_m = \lambda_n y_n - y, \qquad y \text{ in } Y_{n-1}.$$

So by (23),

$$|Cy_n - Cy_m| \ge \frac{|\lambda_n|}{2}. \tag{25}$$

Since each $y_n$ is a unit vector, and since $C$ is assumed to map the unit ball into a precompact set, there can be only a finite number of $\lambda_n$ with $|\lambda_n| > \delta$, for any $\delta > 0$.

We can restate (25) in a quantitative form. Recall the definition of the capacity function $C(\epsilon, K)$ of a precompact set $K$ in a metric space: it is the maximum number of points $z_1, \ldots, z_C$ in $K$ such that the distance of any two distinct $z_j$ is at least $\epsilon$:

$$d(z_n, z_m) \ge \epsilon, \qquad n \ne m.$$

In terms of the function $C$ the inequality (25) leads to the following estimate of the number $N(\epsilon)$ of eigenvalues of absolute value $> \epsilon$:

$$N(\epsilon) \le C(\epsilon, 2C(B)), \tag{26}$$

where $B$ is the unit ball in $X$.
Exercise 3. Show that the factor 2 on the right in (26) can be omitted. 


(iii) To show that the resolvent of $C$ has poles at $\lambda_j$, take $\zeta$ to be near but $\ne \lambda_j$. By definition of the resolvent, $(\zeta - C)^{-1}x = u$ means that

$$\zeta u - Cu = x. \tag{27}$$

We will solve this equation for $u$ in two stages. Write $\lambda = \lambda_j$. Choose $i$ so large that $N_{i+1} = N_i$, where $N_k$ is $N_{(\lambda - C)^k}$, and denote $N_i$ as $N$. Since $N$ is an invariant subspace of $C$, $C$ can be interpreted as a map

$$C : X/N \to X/N. \tag{28}$$

According to part (ii) of theorem 2, $C$ in (28) is compact. We claim that $\lambda$ belongs to the resolvent set of $C$ over $X/N$. If it did not, $\lambda$ would, by part (ii), belong to the point spectrum of $C$ over $X/N$; that means that some point $y$ in $X$, $\not\equiv 0 \mod N$, would be mapped by $\lambda - C$ into $N$. But such a $y$ would belong to $N_{i+1}$ and not to $N_i$, contrary to our choice of $i$. It follows then, since $C$ is compact over $X/N$, that $\lambda - C$ is invertible on $X/N$. Since the collection of invertible maps is open, it follows that $\zeta - C$ is invertible on $X/N$ for $|\zeta - \lambda|$ sufficiently small, and that

$$|(\zeta - C)^{-1}| \le \text{const.} \tag{29}$$


The first step in solving (27) is to solve the congruence

$$\zeta v - Cv \equiv x \mod N; \tag{30}$$

according to (29), (30) has a unique solution $v$, and $|v| \le \text{const.}\,|x|$. The congruence (30) means that

$$\zeta v - Cv = x - n, \qquad n \in N. \tag{31}$$

The second step is to find a solution $z$ in $N$ of the equation

$$\zeta z - Cz = n. \tag{32}$$

Adding (32) to (31), we obtain $u = v + z$ as the solution of (27).

Solving (32) for $z$ in $N$ is a problem in linear algebra. By definition of $\lambda$ as an eigenvalue of $C$, we know that $\lambda$ belongs to the spectrum of $C$ over $N$. Since $N$ is finite dimensional, it follows that no $\zeta$ sufficiently close to $\lambda$ and $\ne \lambda$ belongs to the spectrum of $C$ over $N$. Thus (32) has a unique solution for such $\zeta$, and since the resolvent over a finite-dimensional space is a rational function, that solution $z$ satisfies $|z| \le \text{const.}\,|\zeta - \lambda|^{-i}|n|$. It follows from (31) that $|n| \le \text{const.}\,|x|$, so $|z| \le \text{const.}\,|\zeta - \lambda|^{-i}|x|$. Combining this with $|v| \le \text{const.}\,|x|$, we get

$$|u| = |(\zeta - C)^{-1}x| = |v + z| \le |v| + |z| \le \text{const.}\,|\zeta - \lambda|^{-i}|x|.$$

This inequality can be expressed so: for $\zeta$ near enough to $\lambda$ but $\ne \lambda$,

$$|(\zeta - C)^{-1}| \le \text{const.}\,|\zeta - \lambda|^{-i}.$$

In words, near an eigenvalue $\lambda$ of a compact map $C$, the resolvent blows up at most like the minus $i$th power of the distance to $\lambda$. From this it is easy to deduce, as in classical function theory, that the resolvent of $C$ has a pole of order $i$. This completes the proof of theorem 6. □


Note that the proof of part (iii) gives another proof of part (ii). 


Exercise 4. Show that the resolvent of $C$ has a Laurent expansion around $\lambda$ of the form

$$(\zeta - C)^{-1} = \sum_{j=-i}^{\infty} A_j (\zeta - \lambda)^j. \tag{33}$$

Show that $A_{-1}$ is a projection, namely that $A_{-1}^2 = A_{-1}$; see theorem 6 of chapter 17. Show that the range of this projection is $N_i$; show that

$$A_{-j} = (C - \lambda)^{j-1} A_{-1}, \qquad j = 2, \ldots, i. \tag{34}$$


Exercise 5. Show that a compact operator on a Banach space $X$, $\dim X = \infty$, is not invertible.


In what follows, $X$ denotes a Banach space, and $B$ a bounded linear operator $X \to X$ such that for some integer $n$

$$B^n = C \tag{35}$$

is a compact operator. The basic properties of compact operators, theorems 4, 5, and 6, are true for operators whose power is compact.


Theorem 4'. Let $B : X \to X$ be a map with compact power; set

$$S = I - B. \tag{36}$$

(i) The nullspace $N_S$ of $S$ is finite-dimensional.
(ii) There is an integer $i$ such that

$$N_{S^k} = N_{S^i} \quad \text{for all } k > i. \tag{37}$$

(iii) The range $R_S$ of $S$ is closed.

Theorem 5'. For $S$ as defined in (36),

$$\operatorname{ind} S = \dim N_S - \operatorname{codim} R_S = 0. \tag{38}$$

Theorem 6'. $B : X \to X$ a map with compact power.

(i) The spectrum of $B$ consists of an at most denumerable set of complex numbers $\{\beta_n\}$ that accumulate only at 0. If $\dim X = \infty$, 0 belongs to $\sigma(B)$.

(ii) Each $\beta_j$ is of finite multiplicity and finite index.

(iii) The resolvent of $B$ has poles at $\beta_j$.

Proof. We start with the identity

$$I - B^n = (I - B)(I + B + \cdots + B^{n-1}). \tag{39}$$

Denote $I - B^n = T$, $I + B + \cdots + B^{n-1} = Q$. Then (39) can be rewritten

$$T = SQ = QS. \tag{40}$$




From this we deduce that

$$N_T \supset N_S, \qquad R_T \subset R_S. \tag{41}$$

By assumption $B^n = C$ is compact; it follows from theorems 4 and 5 that

$$\dim N_T < \infty, \qquad \operatorname{codim} R_T < \infty. \tag{42}$$

Combining (42) and (41), it follows that $\dim N_S < \infty$, $\operatorname{codim} R_S < \infty$. By theorem 4, $R_T$ is closed; from this and (41) it follows that so is $R_S$.
Raising (40) to the $k$th power gives

$$T^k = S^kQ^k = Q^kS^k,$$

from which we deduce that

$$N_{T^k} \supset N_{S^k}, \qquad R_{T^k} \subset R_{S^k}. \tag{43}$$

By assumption, $B^n = C$ is compact; by part (ii) of theorem 4, $N_{T^k}$ is independent of $k$ for $k > i$. It follows then from (43) that the nullspaces $N_{S^k}$, $k = 1, 2, \ldots$ are all contained in the finite-dimensional space $N_{T^i}$. From this it follows that the spaces $N_{S^k}$, ordered by inclusion, are independent of $k$ for $k$ large enough. This completes the proof of theorem 4'. □
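The factorization (39)-(40) on which this proof rests is a purely algebraic identity, so it can be checked on any small matrix. The sketch below is an illustration only: the $2 \times 2$ integer matrix `B`, the exponent `n = 4`, and the helper names are hypothetical choices, not taken from the text.

```python
def identity(n):
    """n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_pow(a, k):
    """k-th power by repeated multiplication (k >= 0)."""
    result = identity(len(a))
    for _ in range(k):
        result = mat_mul(result, a)
    return result

B = [[2, 1], [1, 3]]      # hypothetical matrix standing in for the operator B
n = 4
I = identity(2)

S = mat_sub(I, B)                     # S = I - B
T = mat_sub(I, mat_pow(B, n))         # T = I - B^n
Q = identity(2)                       # Q = I + B + ... + B^(n-1)
for k in range(1, n):
    Q = mat_add(Q, mat_pow(B, k))

# Relation (40): T = SQ = QS, since S and Q are both polynomials in B.
factorization_holds = mat_mul(S, Q) == T and mat_mul(Q, S) == T
print(factorization_holds)
```

Integer entries keep the comparison exact, so no floating-point tolerance is needed.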


We turn now to theorem 6'. According to the spectral mapping theorem (see theorem 5 of chapter 17), and since $C = B^n$,

$$\sigma(C) = \sigma(B^n) = \sigma(B)^n. \tag{44}$$

By assumption, $C$ is compact; so according to theorem 6, $\sigma(C)$ consists of a denumerable set of points accumulating only at 0. It follows from (44) that the same is true of the spectrum of $B$; this proves (i).


Exercise 6. Estimate the number of eigenvalues $\beta$ of $B$ that satisfy $|\beta| > \epsilon$ in terms of the capacity $C(\epsilon, B^nB)$, where the second $B$ denotes the unit ball in $X$.


We have already proved part (ii) of theorem 6'; we leave part (iii) as the next exercise.


Exercise 7. Prove part (iii) of theorem 6'. □


We turn now to theorem 5'. If the operator $Q$ defined in (39) is invertible, then it follows from the factorizations (40) that the nullspaces of $T$ and $S$ have the same dimension, and their ranges have the same codimension. Thus $\operatorname{ind} T = \operatorname{ind} S$; since according to theorem 5, $\operatorname{ind} T$ is zero, $\operatorname{ind} S = 0$ also, as asserted by theorem 5'.

To ascertain that $Q$ is invertible, we note that $Q$ is a polynomial in $B$, and we appeal to the spectral mapping theorem, theorem 4 of chapter 17. According to it, the spectrum of $Q$ consists of complex numbers of the form $1 + \beta + \cdots + \beta^{n-1}$, $\beta$ in the spectrum of $B$. It follows that if $\sigma(B)$ contains no $n$th root of unity other than 1, $Q$ is invertible. If $\sigma(B)$ does contain $n$th roots of unity, we perturb slightly the product (40). □
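The role of the $n$th roots of unity in this argument comes from the closed form of the geometric sum. The short derivation below is a worked step added for illustration; it uses only the formula for a finite geometric series.

```latex
1 + \beta + \cdots + \beta^{n-1} =
\begin{cases}
  \dfrac{1 - \beta^n}{1 - \beta}, & \beta \ne 1, \\[1ex]
  n, & \beta = 1.
\end{cases}
% The sum vanishes exactly when \beta^n = 1 and \beta \ne 1.
% Example with n = 2: Q = I + B, and 1 + \beta = 0 only for \beta = -1.
```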


Exercise 8. Show by an example that if $M$ is the strong limit of a sequence of compact operators, $M$ need not be compact.


Exercise 9. Show that if $C$ is compact and $\{M_n\}$ tends strongly to $M$, then $CM_n$ and $M_nC$ tend uniformly to $CM$ and $MC$, respectively.


Theorem 7 (Schauder). The transpose $C'$ of a compact map $C : X \to U$ is compact, and conversely.


Proof. We have to show that the image of the unit ball in $U'$ under $C'$ is precompact. According to criterion (a) for precompactness, given any sequence $\{\ell_n\}$ in $U'$, $|\ell_n| \le 1$, we have to show that $\{C'\ell_n\}$ has a Cauchy subsequence. Denote by $K$ the closure of $CB$, $B$ the unit ball in $X$. Since $C$ is assumed compact, $K$ is a compact subset of $U$. The functions $(\ell_n, u)$ of $u$ are uniformly bounded and are equicontinuous on $K$:

$$|(\ell_n, u) - (\ell_n, v)| = |(\ell_n, u - v)| \le |u - v|.$$

According to the theorem of Arzela-Ascoli, a uniformly bounded, equicontinuous sequence of functions on a compact set $K$ has a uniformly convergent subsequence:

$$|(\ell_n, u) - (\ell_m, u)| < \epsilon \tag{45}$$

for all $n, m > N$, for all $u$ in $K$. Since every $u$ of the form $u = Cx$, $|x| \le 1$, belongs to $K$, we conclude from (45) that

$$|(\ell_n - \ell_m, Cx)| = |(C'\ell_n - C'\ell_m, x)| < \epsilon \tag{46}$$

for all $x$, $|x| \le 1$. By definition of the norm in $X'$, this proves that

$$|C'\ell_n - C'\ell_m| \le \epsilon \tag{47}$$

for $n, m > N$, namely that $\{C'\ell_n\}$ is a Cauchy sequence.

Conversely, if $C'$ is a compact mapping, then by what we have proved above $C''$ is a compact map. Since $C$ is the restriction of $C''$ to $X$, it follows from part (i) of theorem 2 that $C$ is a compact map. This completes the proof of theorem 7. □


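Theorem 8. Let $C : X \to X$ be a compact map, and $T = I - C$.

(i) A vector $u$ belongs to the range of $T$ iff $(u, \ell) = 0$ for every $\ell$ in the nullspace of $T'$.

(ii) The nullspaces of $T$ and $T'$ have the same dimension: $\dim N_{T'} = \dim N_T$.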




Proof. The defining relation between $T$ and its transpose is

$$(Tx, \ell) = (x, T'\ell).$$

It follows that the nullspace of $T'$ is the annihilator $R_T^\perp$ of the range of $T$.

(i) Since according to theorem 4, $R_T$ is closed, it follows from theorem 8 of chapter 8 that every vector $u$ that satisfies $(u, \ell) = 0$ for all $\ell$ in $R_T^\perp$ belongs to $R_T$.

(ii) We saw in chapter 8, exercise 2, that for a closed subspace $R$ of $X$ the annihilator of $R$ is isomorphic with the dual of $X/R$. Therefore $\dim R^\perp = \dim (X/R)' = \dim X/R = \operatorname{codim} R$. Apply this to $R = R_T$. Since the annihilator of $R_T$ is the nullspace of $T'$, $\dim N_{T'} = \operatorname{codim} R_T$. Since, by theorem 5, $\operatorname{ind} T = 0$, we deduce that $\operatorname{codim} R_T = \dim N_T$. It follows that $\dim N_{T'} = \dim N_T$. □

Theorem 8 is called the Fredholm alternative.
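A two-dimensional instance makes the alternative concrete; in finite dimensions every linear map is compact, so theorem 8 applies. The matrix $C$ below is a hypothetical choice made for this illustration, and the pairing is taken to be $(u, \ell) = u_1\ell_1 + u_2\ell_2$.

```latex
% X = \mathbb{C}^2, pairing (u, \ell) = u_1 \ell_1 + u_2 \ell_2.
C = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
T = I - C = \begin{pmatrix} 0 & -1 \\ 0 & 1 \end{pmatrix}, \qquad
T' = \begin{pmatrix} 0 & 0 \\ -1 & 1 \end{pmatrix}.

R_T = \operatorname{span}\{(-1, 1)\}, \qquad
N_T = \operatorname{span}\{(1, 0)\}, \qquad
N_{T'} = \operatorname{span}\{(1, 1)\}.

% u lies in R_T iff u_1 + u_2 = 0, that is, iff (u, \ell) = 0 for \ell = (1, 1);
% and \dim N_T = \dim N_{T'} = 1.
```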


Theorem 9. A compact map C : X —> U maps every weakly convergent sequence 
into one that converges strongly. 


Exercise 10. 


(i) Prove theorem 9. 
(ii) What about its converse? 


NOTE. F. Riesz’s paper was, and is, fundamental for the theory of compact oper- 
ators in a Banach space. There he took Hilbert’s definition of a compact operator, 
called in those days completely continuous, and showed how to extend this notion 
to Banach spaces. His accomplishment is all the more remarkable, since 1918 pre- 
dates Banach’s fundamental paper by a good five years! And unlike Banach, he treats 
normed linear spaces over the complex field, which is essential for spectral theory. 


HISTORICAL NOTE. Julius Schauder (1899-1943) was the most brilliant of the Polish mathematicians of his time. Schauder bases, the Schauder fixed point theorem, the Leray-Schauder degree of a mapping, as well as many fundamental results in the theory of elliptic and hyperbolic partial differential equations, are his creation. Being Jewish, he was killed during the Nazi occupation of Poland. Such things were so routine, nobody knows when or where.


BIBLIOGRAPHY 


Calkin, J. W. Two-sided ideals and congruences in the ring of bounded operators in Hilbert space. Ann. Math. (2), 42 (1941): 839-873.


Riesz, F. Über lineare Funktionalgleichungen. Acta Math., 41 (1918): 71-98.


Schauder, J. Über lineare, vollstetige Funktionaloperationen. Studia Math., 2 (1930): 183-196.


22 


EXAMPLES OF 
COMPACT OPERATORS 


22.1 COMPACTNESS CRITERIA 


We start by giving some useful criteria in various topologies for sets of functions to 
be compact. The first theorem is the famous Arzela-Ascoli criterion. 


Definition. $S$ is a Hausdorff space; a collection $\{g\}$ of complex-valued functions $g$ on $S$ is called equicontinuous if for every point $s$ of $S$ and every $\epsilon > 0$ there is an open set $N$ containing $s$ such that for every $r$ in $N$ and every $g$ of the collection

$$|g(r) - g(s)| < \epsilon. \tag{1}$$


Theorem 1. $S$ is a compact Hausdorff space, $\{g\}$ a collection of complex-valued functions on $S$ satisfying these conditions:

(i) The collection $\{g\}$ is equicontinuous.
(ii) $\{g\}$ is uniformly bounded:

$$|g(s)| \le M \tag{1'}$$

for all $s$ in $S$ and every $g$ in the collection.

Assertion. The collection $\{g\}$ is precompact in the maximum norm on $S$.

For a proof, we refer to any text on real variables, such as that by Royden.


Exercise 1. Show that conditions (i) and (ii) of theorem 1 are necessary for a collection to be precompact in the maximum norm.


Exercise 2. Show that if $Q$ is an open bounded set in $\mathbb{R}^m$ such that any two points of $Q$ can be connected by a path of length $\le \ell$, then a family of functions $\{g\}$ on $Q$ is precompact in the maximum norm if the functions $g$ and their first partial derivatives are uniformly bounded in $Q$:

$$|g| \le M, \qquad |\partial_i g| \le M \quad \text{in } Q.$$


Exercise 3. Formulate a version of theorem 1 for functions whose values lie in a metric space.


Another equally—or even more—important compactness criterion is due to 
Rellich: 


Theorem 2. $Q$ is a domain in $\mathbb{R}^m$, open and bounded, whose boundary is smooth. Suppose that $\{u\}$ is a collection of functions in $Q$ such that the functions $u$ and their first derivatives are uniformly bounded in the $L^2(Q)$ norm:

$$\|u\| \le M, \qquad \|\partial_i u\| \le M, \qquad i = 1, \ldots, m. \tag{2}$$

Then the collection $\{u\}$ is precompact in the $L^2(Q)$ norm.


Proof of Rellich's criterion is based on Poincaré's inequality: for any smooth function $u$ defined over a smoothly bounded domain $Q$,

$$\int_Q |u(x)|^2\, dx \le \frac{1}{|Q|}\left|\int_Q u\, dx\right|^2 + c\, d^2 \int_Q \sum_j |\partial_j u|^2\, dx, \tag{3}$$

where $d$ is the diameter of $Q$, $|Q|$ its volume, and $c$ a constant. For details we refer to Courant-Hilbert.
We remark that Poincaré's inequality, and likewise Rellich's criterion, do not hold for arbitrary bounded domains.


22.2 INTEGRAL OPERATORS 


We turn now to integral operators, see equation (1), chapter 16, where we have defined $g = Kf$ by

$$g(s) = \int_T K(s, t) f(t)\, dn(t), \tag{4}$$

$s \in S$, $t \in T$, both compact metric spaces. We will investigate conditions on the kernel $K(s, t)$ that make $K$ compact in various topologies.


Theorem 3. The integral operator (4) is compact as a map $L^1 \to C$ if the kernel $K(s, t)$ is a continuous function of $s$ and $t$.


Proof. To show that $K$ is a compact operator with respect to these norms, we have to verify that the functions

$$g = Kf, \qquad |f|_{L^1} \le 1, \tag{5}$$

satisfy the compactness criteria (1), (1'). Since $S$ and $T$ are compact spaces, the kernel is uniformly bounded, so by theorem 1 of chapter 16, $K$ is bounded; thus condition (1') is satisfied. To verify (1), we study $g(r) - g(s)$:

$$|g(r) - g(s)| = \left|\int [K(r, t) - K(s, t)] f(t)\, dn\right| \le \sup_t |K(r, t) - K(s, t)|\, |f|_{L^1}. \tag{6}$$

Since $T \times S$ is compact, the kernel is uniformly continuous, so the right side of (6) tends to zero as $r$ tends to $s$. This shows that the functions $g$ in (5) are equicontinuous. □


Using an analogous argument we can prove 


Theorem 3'. The integral operator (4) is compact as a map $C \to C$ if the kernel $K(s, t)$ is a continuous function of $s$ in the $L^1$ norm with respect to $t$.


Next we recall theorem 2 of chapter 16, which says that an integral operator $K$ whose kernel $K$ is square integrable is a bounded mapping $L^2 \to L^2$, and that

$$\|K\|^2 \le \iint |K(s, t)|^2\, dm\, dn. \tag{7}$$

Condition (7) implies more:


Theorem 4. An integral operator with a square integrable kernel is a compact map of $L^2 \to L^2$.


Proof. Let $\{u_j\}$ be an orthonormal basis of $L^2(S, dm)$. We expand $K(s, t)$ for fixed $t$ into a series

$$K(s, t) = \sum_j K_j(t) u_j(s). \tag{8}$$

Using the Parseval relation for $\{u_j\}$ gives for almost all values of $t$,

$$\int |K(s, t)|^2\, dm(s) = \sum_j |K_j(t)|^2. \tag{9}$$

Integrating (9) gives

$$\iint |K(s, t)|^2\, dm\, dn = \sum_j \int |K_j(t)|^2\, dn(t). \tag{10}$$

Now define

$$K_N(s, t) = \sum_{j \le N} K_j(t) u_j(s), \tag{11}$$

and denote by $K_N$ the integral operator with kernel $K_N$. Clearly, $K_N$ is a degenerate operator, meaning its range is finite dimensional; then, as observed in exercise 2 of chapter 21, $K_N$ is compact. We claim that as $N \to \infty$, $K_N$ tends in norm to $K$. For applying inequality (7) to $K - K_N$, we get

$$\|K - K_N\|^2 \le \iint |K - K_N|^2\, dm\, dn = \sum_{j > N} \int |K_j(t)|^2\, dn(t). \tag{12}$$

It follows from (10) that the right side of (12) tends to 0 as $N \to \infty$. As noted in theorem 1, part (v), chapter 21, the uniform limit of compact maps is compact. □


Exercise 4. Construct an example of an integral operator whose kernel satisfies Holmgren's condition (see theorem 3 of chapter 16) that is not compact.


In chapter 20, theorem 4, we have shown that the operation of integration, defined as

$$(Vx)(s) = \int_0^s x(t)\, dt, \tag{13}$$

regarded as a mapping of C[0, 1] to C[0, 1], has a spectrum consisting of 0 alone. Here we give another proof of this fact. The kernel of the integral operator V in (13) is

$$K(s,t) = \begin{cases} 1 & \text{for } t < s, \\ 0 & \text{for } t > s. \end{cases} \tag{14}$$

Clearly, this kernel is a continuous function of s in the $L^1$ norm in t. So according to theorem 3', V is a compact map of C[0, 1] into itself.
According to theorem 5, chapter 21, the spectrum of a compact operator consists of 0 and eigenvalues λ ≠ 0. We show now that V has no nonzero eigenvalues, for suppose that x were an eigenfunction, λ an eigenvalue ≠ 0,

$$Vx = \int_0^s x(t)\, dt = \lambda x(s). \tag{15}$$

The left side is a differentiable function of s, therefore so is the right. Differentiating (15), we get

$$x(s) = \lambda x'(s).$$

All nonzero solutions of this are of the form $x(s) = c\,e^{s/\lambda}$, c ≠ 0. In particular, x(0) = c ≠ 0; setting s = 0 in (15), we get 0 = λx(0), in contradiction to what we have shown above. This proves that V has no eigenvalue λ ≠ 0, and so

$$\sigma(V) = \{0\}. \tag{16}$$
□
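The conclusion σ(V) = {0} can also be checked against the spectral radius formula. The following is a minimal sketch, not part of the text's proof: it uses the fact that V maps nonnegative functions to nonnegative functions, so the norm of V^n on C[0, 1] is attained on the constant function 1, and V^n 1 = s^n/n!. The helper names are hypothetical, and polynomials are stored as coefficient lists in exact rational arithmetic.

```python
from fractions import Fraction
from math import factorial


def integrate_from_zero(coeffs):
    """Apply V to a polynomial; coeffs[k] is the coefficient of s**k.

    Integrating from 0 to s sends s**k to s**(k + 1) / (k + 1).
    """
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]


def volterra_power_of_one(n):
    """Coefficient list of V^n applied to the constant function 1."""
    poly = [Fraction(1)]
    for _ in range(n):
        poly = integrate_from_zero(poly)
    return poly


def norm_of_volterra_power(n):
    """Maximum of V^n 1 over [0, 1].

    All coefficients are nonnegative, so the maximum is the value at s = 1,
    which is the sum of the coefficients.
    """
    return sum(volterra_power_of_one(n))


for n in (1, 2, 5, 10, 20):
    norm = norm_of_volterra_power(n)
    # The exact norm is 1/n!, and its n-th root shrinks as n grows.
    print(n, norm == Fraction(1, factorial(n)), float(norm) ** (1.0 / n))
```

The printed n-th roots decrease toward 0, in agreement with a spectral radius equal to zero.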




22.3 THE INVERSE OF ELLIPTIC PARTIAL 
DIFFERENTIAL OPERATORS 


We turn now to a class of operators that are defined by solving differential equations. 
Let Q be a bounded domain in $\mathbb{R}^n$ with a smooth boundary; denote by Δ the Laplace operator. It is well known, see section 7.2, that the boundary value problem

$$\Delta u = f \ \text{ in } Q, \qquad u = 0 \ \text{ on } \partial Q, \tag{17}$$
has a unique solution u for every f in C (Q). 


Definition. Denote the solution u of (17) as

$$u = Sf. \tag{18}$$

Theorem 5. S is a compact map of $L^2(Q)$ into $L^2(Q)$.


Proof. Multiply (17) by u, and integrate over Q, integrating by parts:

$$-\int \sum_j |\partial_j u|^2\, ds = \int f u\, ds. \tag{19}$$

As noted in lemma 1 of chapter 7, for all functions u that vanish on ∂Q,

$$\|u\|_0 \le d\, \|u\|_1, \tag{20}$$

where $\|u\|_0$ denotes the $L^2$-norm of u over Q, $\|u\|_1^2$ the sum of the squares of the $L^2$-norms of the first derivatives of u, and d the diameter of Q. Using the Schwarz inequality on the right of (19), and then (20), we get

$$\|u\|_1^2 \le \|f\|_0\, \|u\|_0 \le d\, \|f\|_0\, \|u\|_1,$$

so

$$\|u\|_1 \le d\, \|f\|_0, \qquad \|u\|_0 \le d^2\, \|f\|_0. \tag{21}$$

The image of the unit ball in $L^2(Q)$ under S consists of solutions u of (17) corresponding to f with $\|f\|_0 \le 1$. It follows from (21) that these satisfy

$$\|u\|_1 \le d, \qquad \|u\|_0 \le d^2. \tag{21'}$$

According to theorem 2, a set of functions satisfying (21') is precompact in the $L^2(Q)$ norm. This proves that S is a compact operator. □
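As an illustration of theorem 5, here is a worked one-dimensional case. It is a sketch under the assumption Q = (0, 1), so that d = 1 and (17) becomes an ordinary differential equation; the functions $u_k$ are introduced only for this example.

```latex
% Assumed setting: Q = (0,1), d = 1, and (17) reads u'' = f, u(0) = u(1) = 0.
\[
  u_k(s) = \sin(k\pi s), \qquad u_k'' = -k^2\pi^2\, u_k, \qquad u_k(0) = u_k(1) = 0,
  \qquad k = 1, 2, \dots
\]
% Hence u_k solves (17) with f = -k^2 \pi^2 u_k, that is,
\[
  S u_k = -\frac{1}{k^2\pi^2}\, u_k .
\]
% The u_k form an orthogonal basis of L^2(0,1), so S is diagonal in this basis
% with eigenvalues tending to 0, as expected of a compact operator.
% The largest eigenvalue in absolute value is 1/\pi^2 \le d^2 = 1,
% in agreement with the second estimate in (21).
```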


Exercise 5. Show that theorem 5 is true even when Q is not smoothly bounded. 
(Hint: Use lemma 2 in chapter 7.) 


Theorem 5 can be extended with practically no change to all second order elliptic operators with Dirichlet boundary condition. Under the Neumann boundary condition $u_\nu = 0$ it is necessary to require Q to be smoothly bounded.


22.4 OPERATORS DEFINED BY PARABOLIC EQUATIONS


We consider the heat equation

$$u_t = \Delta u \tag{23}$$

for functions u(s, t), s ∈ Q as in the previous example, and t > 0. It is well known, and will be proved in the chapter on semigroups, that the initial boundary value problem for the parabolic equation has a unique solution; that is, there is a unique solution of (23) satisfying

$$u(s, 0) \ \text{given in } Q, \qquad u = 0 \ \text{on } \partial Q \ \text{for all } t > 0. \tag{23'}$$

We denote by S(T) the operator relating u(s, 0) to u(s, T), T > 0:

$$S(T) : u(s, 0) \to u(s, T). \tag{23''}$$

Theorem 6. $S(T) : L^2(Q) \to L^2(Q)$ is a compact operator for T > 0.


Proof. Multiply equation (23) by u, integrate with respect to s over Q, with respect to t from 0 to T. We get, after integration by parts, that

$$\frac{1}{2} \int u^2\, ds \,\Big|_0^T = \int_0^T\!\!\int u\, \Delta u\, ds\, dt = -\int_0^T\!\!\int \sum_j |\partial_j u|^2\, ds\, dt \le 0. \tag{24}$$

This proves that

$$\int u^2(s, T)\, ds \le \int u^2(s, 0)\, ds,$$

namely that $\|u(t)\|_0$ is a decreasing function of t. In terms of the solution operator S defined in (23'') this can be expressed as follows:

$$\|S(T)\| \le 1. \tag{25}$$

Next multiply (23) by tΔu and integrate with respect to s over Q and with respect to t from 0 to T. We get

$$\int_0^T\!\!\int t\, u_t\, \Delta u\, ds\, dt = \int_0^T\!\!\int t\, (\Delta u)^2\, ds\, dt. \tag{26}$$

Integrate by parts on the left with respect to s and t; using the abbreviation $u_j$ for the partial derivatives of u, we get the following expression for the left side of (26):

$$-\frac{T}{2} \int \sum_j u_j^2(s, T)\, ds + \frac{1}{2} \int_0^T\!\!\int \sum_j u_j^2\, dt\, ds.$$

Since the right side of (26) is nonnegative, we deduce that

$$\frac{T}{2} \int \sum_j u_j^2(s, T)\, ds \le \frac{1}{2} \int_0^T\!\!\int \sum_j u_j^2\, ds\, dt.$$

We can use (24) to estimate the right side above; we obtain

$$T \int \sum_j u_j^2(s, T)\, ds \le \frac{1}{2} \int u^2(s, 0)\, ds.$$

Using Rellich's compactness criterion, theorem 2, and exercise 6, we conclude that for T > 0, the image of the set $\int u^2(s, 0)\, ds \le 1$ under S(T) is precompact. □
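A worked one-dimensional case may help to see why T > 0 is needed in theorem 6. This is a sketch under the assumption Q = (0, 1); the coefficients $a_k$ are introduced only for this example.

```latex
% Assumed setting: Q = (0,1), heat equation u_t = u_{ss}, u = 0 at s = 0 and s = 1.
\[
  u(s, 0) = \sum_{k \ge 1} a_k \sin(k\pi s)
  \quad\Longrightarrow\quad
  u(s, T) = \sum_{k \ge 1} a_k\, e^{-k^2\pi^2 T} \sin(k\pi s).
\]
% So S(T) is diagonal in the sine basis:
\[
  S(T) \sin(k\pi s) = e^{-k^2\pi^2 T} \sin(k\pi s), \qquad
  \|S(T)\| = e^{-\pi^2 T} \le 1,
\]
% in agreement with (25). For T > 0 the eigenvalues e^{-k^2 \pi^2 T} tend to 0
% as k grows, as compactness requires; for T = 0 all of them equal 1 and
% S(0) = I is not compact on the infinite-dimensional space L^2(0,1).
```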


22.5 ALMOST ORTHOGONAL BASES 


The following result due to Paley and Wiener shows that a not too large perturbation 
of an orthonormal basis is a basis: 


Theorem 7. Let H denote a Hilbert space, and $\{x_n\}$ an orthonormal basis of H. Let $\{y_n\}$ be a collection of elements that doesn't differ too much from $\{x_n\}$ in the sense that

$$\sum_n \|x_n - y_n\|^2 < \infty. \tag{27}$$

We further assume that the $\{y_n\}$ are linearly independent in the sense that no $y_n$ lies in the closed linear span of the other $y_k$.


Assertion. The $\{y_n\}$ form a basis in the sense that every u in H can be written uniquely as a linear combination of $\{y_n\}$.


Proof. (Birkhoff-Rota, and Sz.-Nagy) Since the $\{x_n\}$ are an orthonormal basis, every u in H can be expanded as

$$u = \sum_n a_n x_n, \qquad a_n = (u, x_n), \qquad \text{and} \quad \|u\|^2 = \sum_n |a_n|^2. \tag{28}$$

Define the linear map $B : H \to H$ as follows: for u given by (28),

$$Bu = \sum_n a_n y_n. \tag{29}$$

The series on the right converges; for write $Bu - u = \sum a_n (y_n - x_n)$. Then, by the triangle inequality and Schwarz's inequality,

$$\|Bu - u\| \le \sum_n |a_n|\, \|y_n - x_n\| \le \Big( \sum_n |a_n|^2 \Big)^{1/2} \Big( \sum_n \|y_n - x_n\|^2 \Big)^{1/2} \le \|u\|\ \text{const.}, \tag{30}$$

where in the last step we have used assumption (27), and (28). This proves that B − I is a bounded linear map. We claim that B − I is a compact map; for write

$$Bu - u = \sum_{0}^{N} a_n (y_n - x_n) + \sum_{N+1}^{\infty} a_n (y_n - x_n) = G_N u + R_N u. \tag{31}$$

We estimate $R_N u$ as in (30):

$$\|R_N u\| \le \|u\| \Big( \sum_{N+1}^{\infty} \|y_n - x_n\|^2 \Big)^{1/2}. \tag{32}$$

It follows from (32) and (27) that

$$\lim_{N \to \infty} \|R_N\| = 0.$$

This and the decomposition (31) show that B − I is the uniform limit of the $G_N$. The mappings $G_N$ are degenerate, therefore compact; so B − I is the uniform limit of compact maps. According to theorem 2 of chapter 21, B − I itself is compact.

The nullspace of B is trivial; for if Bu were = 0 for some u ≠ 0, by (29) $0 = \sum a_n y_n$ would give a nontrivial linear relation among the $y_n$, which is excluded by the assumption of linear independence. We apply now theorem 5 of chapter 21 to B = I + (B − I) to conclude that the range of B is all of H, as asserted in theorem 7. □


BIBLIOGRAPHY 


Birkhoff, G. and Rota, G.-C. On the completeness of Sturm-Liouville expansions. Am. Math. Monthly, 67 (1960): 835-841.

Courant, R. and Hilbert, D. Methoden der Mathematischen Physik. Springer, Berlin, 1993; see p. 488.

Paley, R. E. A. C. and Wiener, N. Fourier Transforms in the Complex Domain. AMS Coll. Publ., New York, 1934.

Poincaré, H. Rend. Circ. Mat. Palermo (1894).

Rellich, F. Ein Satz über mittlere Konvergenz. Ges. Wiss. Gött., Nachrichten (1930).

Sz.-Nagy, B. Expansion theorems of Paley-Wiener type. Duke Math. J., 14 (1947): 975-978.


23


POSITIVE COMPACT 
OPERATORS 


23.1 THE SPECTRUM OF COMPACT POSITIVE OPERATORS 


A classic and important result of linear algebra, due to Perron (e.g., see Lax, p. 196) 
asserts that a matrix with all positive entries has a positive eigenvalue that is the 
largest in absolute value among all eigenvalues. In this chapter we present an infinite- 
dimensional generalization. 


Theorem 1. Let Q be a compact Hausdorff space, X = C(Q), K a linear map $C(Q) \to C(Q)$, mapping real-valued functions into real-valued functions. We assume that


(i) K is strictly positive in the sense that if p is any nonnegative function on Q, $p \not\equiv 0$, then Kp is positive on Q.


(ii) K is compact.


We claim that then


K has a positive eigenvalue σ of multiplicity one and index one, with positive eigenfunction.


All other eigenvalues μ of K are smaller in absolute value than σ:

$$|\mu| < \sigma. \tag{1}$$


Proof. The strict positivity of the operator K implies that K is strictly monotone, that is,

$$\text{if } x \le y \text{ and } x \not\equiv y, \text{ then } Kx < Ky. \tag{2}$$

We consider now the action of K on nonnegative functions. We consider the set of all positive numbers κ such that there is a nonnegative function x, $x \not\equiv 0$, such that at all points of Q,

$$\kappa x \le Kx, \qquad 0 \le x. \tag{3}$$

We denote by |x| the maximum norm of x and by |K| the corresponding norm of K. It is easy to see that the set of κ is bounded, for (3) implies that $\kappa |x| \le |Kx| \le |K|\,|x|$, which shows that no κ can exceed |K|. We show now that the set of κ satisfying (3) is not empty. Indeed, take $x(t) \equiv 1$; it follows from the strict positivity of K that Kx is positive. Inequality (3) holds with κ = min Kx.
Next we use the monotonicity of K; combining (2) and (3), we get $\kappa Kx \le K^2 x$, which together with (3) gives $\kappa^2 x \le K^2 x$. Using this argument recursively we get that for any natural number n,

$$\kappa^n x \le K^n x. \tag{4}$$

Since the function x is ≥ 0, we deduce the norm inequality

$$\kappa^n |x| \le |K^n x| \le |K^n|\,|x|,$$

so

$$\kappa^n \le |K^n|. \tag{5}$$

We take the nth root and use the formula for the spectral radius (see theorem 4 of chapter 17):

$$\kappa \le \lim_{n \to \infty} |K^n|^{1/n} = |\sigma(K)|. \tag{6}$$

Since the set of κ is nonempty, it follows that K has a positive spectral radius. Since K is compact, it follows that the set of eigenvalues of K is not empty.

Next we show the converse of inequality (6). Since, according to theorem 6 of chapter 21, the nonzero spectrum of K is an isolated point spectrum, there is an eigenvalue λ and eigenfunction z such that

$$Kz = \lambda z, \qquad |\lambda| = |\sigma(K)|. \tag{7}$$

We claim the following inequality for y(s) = |z(s)|, σ = |σ(K)|:

$$\sigma y \le Ky. \tag{8}$$

Take any point q in the space Q; multiply the function z by a complex number of absolute value 1 so that λz(q) is real and positive. Decompose z as z = u + iv. Separating the real part of (7), we get

$$\lambda z(q) = (Ku)(q). \tag{9}$$

Since the operator K is monotone, and since u ≤ y, we deduce from (9) that

$$|\lambda|\, y(q) = (Ku)(q) \le (Ky)(q). \tag{10}$$

This proves (8), and a little more; since K is strictly monotone, the sign of equality holds in (10) only if z is real and positive, and λ is real and positive.

We show now that in (8) the sign of equality holds. Indeed, suppose that for some q,

$$\sigma y(q) < Ky(q). \tag{11}$$

By continuity, if inequality (11) holds, then there is a positive δ such that in some neighborhood N around q

$$\sigma y(s) + \delta < Ky(s), \qquad s \text{ in } N. \tag{12}$$

We construct now a function p that is positive inside N, zero outside N. Since K is strictly positive, Kp > 0 everywhere. We define the function x by

$$x = y + \varepsilon p. \tag{13}$$

We claim that for ε small enough there is a constant c such that

$$(\sigma + c\varepsilon)\, x \le Kx. \tag{14}$$

We verify this fact first for s in N. Clearly, the right side of (14), Ky + εKp, is > Ky, and the left side differs from σy by O(ε). It follows therefore from (12) that (14) holds in N for ε small enough and c ≤ 1. For s outside N, the function p(s) = 0, so inequality (14) asserts that

$$(\sigma + c\varepsilon)\, y \le Ky + \varepsilon Kp.$$

In view of (8) this would follow from

$$c\, y \le Kp. \tag{15}$$

Since K is strictly positive and p is nonnegative, it follows that Kp > 0 on Q. Clearly, c can be chosen so small that (15) holds. This completes the proof of (14).

Inequality (14) shows that if (12) were to hold, κ = σ + cε and x = y + εp would satisfy inequality (3). But according to (6), no such κ can exceed σ; this is a contradiction, caused by assuming that in (8) the sign < holds at some point. It follows that in (8) equality holds everywhere. We showed earlier that then the eigenfunction z of K with eigenvalue λ whose absolute value is |σ(K)| is positive when multiplied by a constant, and that λ is real and positive. This proves part (ii) of theorem 1.

We claim that σ has multiplicity 1. For if not, K would have two linearly independent eigenfunctions. According to what was shown above, both of these can be chosen to be positive, but some linear combination of them would change sign, a contradiction. This proves the first part of (i) in theorem 1.

There remains to show that K has no generalized eigenfunctions with eigenvalue σ. Denote by K' the transpose of K; K' acts on Borel measures of finite total mass; it is related to K by

$$(Kx, m) = (x, K'm). \tag{16}$$

It follows from (16) that K' too is strictly positive; that is, that if m is a nonnegative measure $\not\equiv 0$, then K'm is a positive measure. K' and K have the same spectrum. Therefore K' has the same dominant eigenvalue σ as K. Since K is compact, so is K'. The eigenmeasure m associated with the eigenvalue σ can be shown to be positive by the same argument as used above.
We are ready to show that K has no generalized eigenfunction with eigenvalue σ, for suppose that w were such a generalized eigenfunction:

$$(K - \sigma) w = v, \qquad (K - \sigma) v = 0.$$

From (16) we deduce that

$$((K - \sigma) w, m) = (w, (K' - \sigma) m). \tag{17}$$

Since (K − σ)w = v and (K' − σ)m = 0, we deduce from (17) that

$$(v, m) = 0.$$

But since v is a positive function and m a positive measure, this is impossible. This completes the proof of theorem 1. □
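The finite-dimensional case (Perron's theorem for a matrix with positive entries) gives a quick numerical picture of the first half of the proof. The sketch below is an illustration, not the argument of the text: it runs a plain power iteration on a 2 × 2 positive matrix chosen for this example and, at each step, records the largest κ for which inequality (3) holds for the current vector x, namely the minimum of (Kx)_i / x_i. The function names are hypothetical.

```python
def apply(matrix, vector):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def dominant_eigenpair(matrix, steps=200):
    """Power iteration for a matrix with positive entries.

    Returns (kappa, x), where kappa = min_i (Kx)_i / x_i is the largest number
    satisfying kappa * x <= K x for the vector x entering the last step, and
    the returned x is the normalized image of that vector.
    """
    x = [1.0] * len(matrix)          # the constant function 1, as in the proof
    kappa = 0.0
    for _ in range(steps):
        y = apply(matrix, x)
        kappa = min(yi / xi for yi, xi in zip(y, x))
        top = max(y)
        x = [yi / top for yi in y]   # normalize in the maximum norm
    return kappa, x


K = [[2.0, 1.0],
     [1.0, 3.0]]
sigma, eigenvector = dominant_eigenpair(K)
print(sigma, eigenvector)
```

For this matrix the eigenvalues are (5 ± √5)/2, so the iteration should settle near 3.618 with a positive eigenvector; the other eigenvalue, about 1.382, is smaller in absolute value, as (1) asserts.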


23.2 STOCHASTIC INTEGRAL OPERATORS 


We consider now an application of theorem 1. 


Theorem 2. Let K(s, t) be a continuous positive function on 0 ≤ s, t ≤ 1, satisfying

$$\int_0^1 K(s, t)\, ds = 1 \quad \text{for all } t. \tag{18}$$

Denote by K the integral operator with kernel K; then


(i) 1 is an eigenvalue of K, with corresponding eigenfunction y that is positive.
(ii) If x is any function satisfying

$$\int x(t)\, dt = 1, \tag{19}$$

then

$$\lim_{n \to \infty} K^n x = y, \tag{20}$$

provided that we normalize the eigenfunction y so that

$$\int y(t)\, dt = 1. \tag{21}$$

Proof. We have shown in chapter 22 that an integral operator with a continuous kernel over a compact set such as [0, 1] is a compact operator. Since the kernel K(s, t) is positive, the operator K is strictly positive. It follows from theorem 1 that K has a positive eigenvalue σ with positive eigenfunction:

$$\int K(s, t)\, y(t)\, dt = \sigma\, y(s). \tag{22}$$

Integrating this with respect to s and using (18), we get

$$\int y(t)\, dt = \sigma \int y(s)\, ds.$$

Since y(t) > 0, $\int y\, dt \ne 0$, so it follows that σ = 1. This proves part (i).
To prove the second part, consider the space Z of all continuous functions z with mean value zero:

$$\int z\, dt = 0. \tag{23}$$

Z is a closed subspace of X = C[0, 1]. We claim that Z is invariant under K, that is, if z belongs to Z, so does u = Kz. To see this integrate both sides of

$$u(s) = \int K(s, t)\, z(t)\, dt$$

with respect to s. Using (18) gives, after changing the order of integration,

$$\int u(s)\, ds = \iint K(s, t)\, z(t)\, dt\, ds = \int z(t)\, dt = 0.$$

Clearly, the spectrum of K over Z is a point spectrum consisting of the spectrum of K over X, with the eigenvalue σ = 1 removed, since σ = 1 has multiplicity 1, and the corresponding eigenfunction y does not belong to Z. According to theorem 1, all the remaining eigenvalues are < 1 in absolute value, and so the spectral radius η of K over Z, abbreviated as $K_Z$, satisfies

$$|\sigma(K_Z)| = \eta < 1.$$

According to formula (12') in theorem 4 of chapter 17,

$$\lim_{n \to \infty} |K_Z^n|^{1/n} = \eta.$$

In particular, for any z in Z,

$$\lim_{n \to \infty} K^n z = 0. \tag{24}$$

We apply this now to z = x − y. Since x satisfies (19), and y is normalized by (21), z satisfies (23) and so belongs to Z. So (24) applies:

$$\lim_{n \to \infty} K^n (x - y) = 0.$$

Since y is an eigenfunction, $y = Ky = \cdots = K^n y$; this proves (20). □


Note that the rate at which $K^n x$ tends to y depends on the size of the second largest eigenvalue η of K. Interesting estimates of the second largest eigenvalue are derived in Lawler and Sokal.
A nonnegative function x(t) that satisfies (19) can be interpreted as a probability density. The kernel K(s, t) is the probability density of transition from state t to state s. Relation (18) means that with probability 1 an occupant of state t makes a transition to some state s. The operator K then models a random process that changes a random variable whose distribution has density x into one whose distribution has density Kx. The eigenfunction y normalized by (21) is a probability density that is invariant under the random process described by K.

Given a probability distribution x, $K^n x$ represents the probability distribution of the system after it has been subjected n times to the random process K. The probabilistic meaning of the limiting relation (20) is that the n-fold application of the random process K turns, as n → ∞, any distribution x into the invariant distribution y.
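The limiting relation can be watched in the simplest discrete analogue, a two-state system. This is a sketch under assumptions made for the example only: the integral over [0, 1] is replaced by a sum over two states, and the kernel is a 2 × 2 matrix with positive entries whose columns sum to 1, which plays the role of condition (18). The names `step` and `iterate` are hypothetical.

```python
def step(kernel, density):
    """One application of the process: (Kx)_s = sum over t of K[s][t] * x[t]."""
    return [sum(k * d for k, d in zip(row, density)) for row in kernel]


def iterate(kernel, density, n):
    """Apply the process n times to the initial density."""
    for _ in range(n):
        density = step(kernel, density)
    return density


# Column sums equal 1: from each state t the total transition probability is 1.
kernel = [[0.9, 0.5],
          [0.1, 0.5]]

start = [0.0, 1.0]                 # all mass initially in the second state
limit = iterate(kernel, start, 60)
print(limit)                       # close to the invariant distribution
print(step(kernel, limit))         # applying K once more changes it very little
```

For this kernel the invariant distribution is (5/6, 1/6), and the second eigenvalue is 0.4 (the trace minus 1), so the distance to the invariant distribution shrinks roughly like 0.4^n, in line with the remark about the second largest eigenvalue.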


23.3 INVERSE OF A SECOND ORDER ELLIPTIC OPERATOR 


Theorem 3. Let L be a differential operator of form

$$L = -\Delta + \sum_i b_i \partial_i + c \tag{25}$$

acting on periodic functions of $s_1, \dots, s_m$. Here $\partial_i = \partial/\partial s_i$, $\Delta = \sum_i \partial_i^2$, and $b_i$ and c are smooth periodic functions of s, c positive. Then for any smooth periodic function f, the equation

$$Lu = f \tag{26}$$


has a unique smooth periodic solution u. We denote this solution as K f = u. The 
operator K regarded as a mapping of the space C of continuous, periodic functions 
into C has the following properties: 


(i) K is a bounded map. 
(ii) K is strictly positive. 
(iii) K is compact. 


Proof. We sketch those parts of the proof that do not require technicalities. Let $s_{\max}$ and $s_{\min}$ denote the points where the function u takes on its maximum and minimum values, respectively. At such points the first derivatives of u are zero, and the second derivatives are nonpositive, respectively nonnegative. Therefore we conclude from equations (25) and (26) that

$$c(s_{\max})\, u_{\max} \le f(s_{\max}), \qquad c(s_{\min})\, u_{\min} \ge f(s_{\min}). \tag{27}$$

Since we have assumed that the function c is positive, we conclude that $|u|_{\max} \le \text{const.}\, |f|_{\max}$, which proves that K is bounded in the maximum norm, as asserted in (i).

It follows from (27) that if f is positive for all s, then so is u. We omit the additional argument to prove what is claimed in part (ii): if f is nonnegative but $\not\equiv 0$, u is positive.

We forgo a proof of part (iii), except to remind the reader that an analogous but weaker result, the compactness of K in the $L^2$-norm, was proved in theorem 5 of chapter 22.


By theorem 1, an operator K having the properties above has a positive eigenvalue λ that dominates all others, with the corresponding eigenfunction strictly positive. Since K is the inverse of L, its eigenvalues are the reciprocals of those of L. Therefore we conclude from theorem 3:


Theorem 4. The second-order elliptic partial differential operator L defined in (25) 
has a positive eigenvalue that is smaller than the absolute value of all other eigen- 
values. The eigenfunction corresponding to this eigenvalue is positive. 
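The simplest instance of theorems 3 and 4 can be worked out by hand. This is a sketch under assumptions made for the example only: one variable (m = 1), no first-order term, and a constant coefficient $c = c_0 > 0$, acting on 2π-periodic functions.

```latex
% Assumed setting: L = -d^2/ds^2 + c_0 on 2\pi-periodic functions, c_0 > 0 constant.
\[
  L\, e^{iks} = (k^2 + c_0)\, e^{iks}, \qquad k = 0, \pm 1, \pm 2, \dots
\]
% The smallest eigenvalue of L is c_0 (k = 0), with eigenfunction 1, which is positive;
% every other eigenvalue k^2 + c_0 is strictly larger, as theorem 4 asserts.
% For the inverse K = L^{-1}:
\[
  K\, e^{iks} = \frac{1}{k^2 + c_0}\, e^{iks},
\]
% so the dominant eigenvalue of K is 1/c_0, simple, with positive eigenfunction 1,
% and the remaining eigenvalues 1/(k^2 + c_0) tend to 0, consistent with compactness.
```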


Theorems 3 and 4 hold for operators L of form (25) where the Laplacian Δ is replaced by any second order elliptic operator $\sum a_{ij} \partial_i \partial_j$, $(a_{ij})$ a positive definite matrix.

Theorem 1 for integral operators has been derived by Jentsch. Krein and Rutman have further extended theorem 1 to spaces where the set of positive functions is replaced by a convex cone in a Banach space.

Estimates for the second largest eigenvalue have been derived by E. Hopf. 


BIBLIOGRAPHY 


Jentsch, R. Über Integralgleichungen mit positivem Kern. J. Reine Angew. Math. 141 (1912): 235-244.

Hopf, E. An inequality for positive operators. J. of Math. and Mech., 12 (1963): 683-692; 889-892.

Krein, M. G. and Rutman, M. A. Linear operators leaving invariant a cone in Banach space. Usp. Mat. Nauk 3 (1), 23 (1948): 3-95, AMS Transl., 26 (1950).

Lawler, G. F. and Sokal, A. D. Bounds on the $L^2$ spectrum for Markov chains and Markov processes; a generalization of Cheeger's inequality. Trans. AMS, 309, (1988): 557-580.

Lax, P. D. Linear Algebra. Wiley, New York, 1997.


24


FREDHOLM'S THEORY OF
INTEGRAL EQUATIONS


The historically first general theory dealing with the solution of linear equations in infinite-dimensional spaces is due to Ivar Fredholm in the year 1900. Its importance was immediately recognized, and it spurred a great deal of further work by Hilbert, Schmidt, F. Riesz, Banach, and many others. These newer theories, set in what was formalized as Hilbert and Banach space, have completely replaced Fredholm's theory. Unlike the new, abstract theories, Fredholm dealt with integral operators, and his central notion was the determinant associated with such operators. Since this determinant appears in some modern theories (inverse scattering, completely integrable systems), it is time to resurrect it.


24.1 THE FREDHOLM DETERMINANT AND 
THE FREDHOLM RESOLVENT 


Fredholm studied integral equations of the form

$$u(x) + \int_0^1 K(x, y)\, u(y)\, dy = f(x), \tag{1}$$

where f is a given continuous function on the interval [0,1] and u an unknown function to be determined. The kernel K(x, y) is assumed here to be continuous; Fredholm's treatment allowed some singularities.

The question of when equation (1) can be solved can be settled by theorem 5 of 
chapter 21. To put equation (1) in that context, we will regard both functions f and u 
as elements of the space of continuous functions C[0,1]. As shown in chapter 22, the 
integral operator on the left of (1) is a compact operator. According to theorem 5 of 
chapter 21, the dimension of the nullspace of the operator acting on u on the left side 
of (1) equals the codimension of its range; by theorem 8, the range is characterized as 
those f that are orthogonal to the nullspace of the transpose of the operator, namely 
with kernel K’(x, y) = K(y, x). This result was proved by Fredholm for operators 
of form (1) and is known to this day as the Fredholm alternative. 


Fredholm's approach was to replace the integral in (1) by a Riemann sum over n intervals of length h. This yields a system of n linear equations for the values $u_j$ of u at the n nodes j/n of the subdivision. Fredholm expressed the solution of these equations as ratios of determinants and took the continuum limits of these determinants as n → ∞.
The discretized form of (1) is

$$u_i + h \sum_{j=1}^{n} K_{ij}\, u_j = f_i, \qquad i = 1, \dots, n, \tag{1'}$$

where $f_i = f(ih)$, h = 1/n and $K_{ij} = K(ih, jh)$. Denote by D(h) the determinant of the matrix acting on the vector u in (1'):

$$D(h) = \det(I + hK_{ij}). \tag{2}$$

Clearly, D(h) is a polynomial in h:

$$D(h) = \sum_{m=0}^{n} a_m h^m. \tag{2'}$$

The coefficients $a_m$ can be determined conveniently as Taylor coefficients:

$$a_m = \frac{1}{m!} \left( \frac{d}{dh} \right)^m D(h) \Big|_{h=0}. \tag{2''}$$

To differentiate a determinant, we use the rule

$$\frac{d}{dh} \det(C_1, \dots, C_n) = \sum_j \det\Big( C_1, \dots, \frac{d}{dh} C_j, \dots, C_n \Big). \tag{3}$$

We use the same rule to find the mth derivative. The resulting formula is simple because each column $C_j$ is a linear function of h. A further simplification is possible because at h = 0, $C_j(0) = E_j$, the jth unit vector. So using (2) in (2''), we get an expression involving determinants of principal minors of $K_{ij}$:

$$D(h) = 1 + h \sum_i K_{ii} + \frac{h^2}{2} \sum_{i,j} \det \begin{pmatrix} K_{ii} & K_{ij} \\ K_{ji} & K_{jj} \end{pmatrix} + \cdots. \tag{4}$$

We now set h = 1/n and let n tend to ∞. The kth term in (4), a sum with respect to k parameters, tends to a k-fold integral. To write these in a compact form, Fredholm introduced the following convenient abbreviation:

$$K\begin{pmatrix} x_1, \dots, x_k \\ y_1, \dots, y_k \end{pmatrix} = \det K(x_i, y_j), \qquad 1 \le i, j \le k. \tag{5}$$

The formal limit as n → ∞ of the finite sum (4), h = 1/n, is the infinite series

$$D = \sum_{k=0}^{\infty} \frac{1}{k!} \int \cdots \int K\begin{pmatrix} x_1, \dots, x_k \\ x_1, \dots, x_k \end{pmatrix} dx_1 \cdots dx_k. \tag{6}$$


Definition. D is called the Fredholm determinant of the operator acting on the left 
in (1). 


Lemma 1. The series (6) converges.


Proof. To prove convergence, Fredholm relied on an inequality due to Hadamard for determinants to estimate the terms of the series (6):

$$|\det(C_1, \dots, C_k)| \le \prod_j \|C_j\|, \tag{7}$$

where ||C|| denotes the Euclidean length of the vector C. The geometric interpretation of (7) is that the volume of the parallelepiped whose vertices are sums of vectors $C_j$ is ≤ the volume of the rectangular parallelepiped whose sides $C_j'$ have the same lengths as the $C_j$:

$$\|C_j'\| = \|C_j\|,$$

where the $C_j'$ are orthogonal to each other.

Exercise 1. Furnish an analytic proof of inequality (7).

The kernel K, being continuous, is bounded:

$$|K(x, y)| \le M \quad \text{for all } x, y,$$

so the length of each column vector of the k × k matrix (5) is ≤ $M\sqrt{k}$. Therefore, according to (7),

$$\left| K\begin{pmatrix} x_1, \dots, x_k \\ y_1, \dots, y_k \end{pmatrix} \right| \le M^k k^{k/2}. \tag{8}$$

The kth term in series (6) is then less than $M^k k^{k/2}/k!$, which, by Stirling's formula, is ≤ $(Me)^k k^{-k/2}$. This estimate shows that the series (6) converges. □
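The passage from the discrete determinant (2) to the series (6) can be tried numerically. The following is a minimal sketch under assumptions made for the example only: the kernel K(x, y) = xy is chosen because it has rank one, so every determinant (5) of order 2 or more vanishes and the series (6) reduces to D = 1 + ∫ x² dx = 4/3. The helper names are hypothetical, and the determinant is computed by ordinary Gaussian elimination rather than by formula (4).

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                      # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def discrete_fredholm_determinant(kernel, n):
    """D(h) = det(I + h K_ij) with h = 1/n and K_ij = kernel(i h, j h), as in (2)."""
    h = 1.0 / n
    matrix = [[(1.0 if i == j else 0.0) + h * kernel(i * h, j * h)
               for j in range(1, n + 1)]
              for i in range(1, n + 1)]
    return determinant(matrix)


def rank_one_kernel(x, y):
    """Example kernel K(x, y) = x y, assumed here for illustration."""
    return x * y


for n in (10, 20, 80):
    print(n, discrete_fredholm_determinant(rank_one_kernel, n))
```

For this kernel the discrete determinant has the closed form 1 + h³ n(n + 1)(2n + 1)/6, so the printed values can be compared with it and should approach 4/3 as n grows.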


Fredholm showed how to get a better estimate than (8) in the case where the kernel K satisfies a Hölder condition in the variable y:

$$|K(x, y) - K(x, z)| \le M |y - z|^{\alpha}. \tag{9}$$

In the matrix on the right in (5) we subtract the (i + 1)st column from the ith, i = 1, ..., k − 1. This new matrix has the same determinant as the original one, and its ith column has length ≤ $M |y_{i+1} - y_i|^{\alpha} \sqrt{k}$. It follows from (7) that

$$\left| K\begin{pmatrix} x_1, \dots, x_k \\ y_1, \dots, y_k \end{pmatrix} \right| \le M^k k^{k/2} \Big( \prod_{i=1}^{k-1} |y_{i+1} - y_i| \Big)^{\alpha}.$$

Since interchanging two of the $y_i$ changes only the sign of the determinant, we may take $0 \le y_1 \le \cdots \le y_k \le 1$, so that the k − 1 differences add up to at most 1. By the arithmetic-geometric mean inequality, for k ≥ 2,

$$\prod_{i=1}^{k-1} |y_{i+1} - y_i| \le \left( \frac{1}{k-1} \right)^{k-1},$$

so we deduce that

$$\left| K\begin{pmatrix} x_1, \dots, x_k \\ y_1, \dots, y_k \end{pmatrix} \right| \le M^k k^{k/2} (k-1)^{-\alpha (k-1)}.$$

The same inequality holds if K(x, y) satisfies a Hölder condition in the variable x.

We return now to equations (1'). To solve such a system of equations, we multiply (1') by the inverse of the matrix acting on the unknowns. The elements of the inverse matrix can be represented as determinants of the minors of co-order 1 (i.e., one less than the order of the matrix to be inverted) divided by the determinant of the matrix. We apply this procedure to the system (1') and obtain a formula analogous to (2), which can be brought to a form analogous to (4). Fredholm, by passing to the limit h → 0, has determined the continuous analogue R of these determinants. Using the notation (5), we can write R in the form

$$R(x, y) = K(x, y) + \int K\begin{pmatrix} x, x_1 \\ y, x_1 \end{pmatrix} dx_1 + \cdots = \sum_{k=0}^{\infty} \frac{1}{k!} \int \cdots \int K\begin{pmatrix} x, x_1, \dots, x_k \\ y, x_1, \dots, x_k \end{pmatrix} dx_1 \cdots dx_k. \tag{10}$$


Inequality (8) shows that the series on the right in (10) converges uniformly for all x and y. This shows that R(x, y) is a continuous function of x, y.
We show now how to use the kernel R(x, y) and the determinant D to solve equation (1). Expand the determinants in (10), defined by (5), according to their first rows:

$$K\begin{pmatrix} x, x_1, \dots, x_k \\ y, x_1, \dots, x_k \end{pmatrix} = K(x, y)\, K\begin{pmatrix} x_1, \dots, x_k \\ x_1, \dots, x_k \end{pmatrix} + \sum_{j=1}^{k} (-1)^j K(x, x_j)\, K\begin{pmatrix} x_1, x_2, \dots, x_k \\ y, x_1, \dots, \widehat{x_j}, \dots, x_k \end{pmatrix}, \tag{11}$$

where $\widehat{x_j}$ indicates that $x_j$ is omitted. Integrate (11) over the unit cube in $x_1, \dots, x_k$ space. We claim that the integrals of the last k terms on the right are all equal; this can be seen by interchanging in the jth integral

$$(-1)^j \int \cdots \int K(x, x_j)\, K\begin{pmatrix} x_1, x_2, \dots, x_k \\ y, x_1, \dots, \widehat{x_j}, \dots, x_k \end{pmatrix} dx_1 \cdots dx_k$$

the names of the variables $x_1$ and $x_j$, and then performing one row permutation and j − 2 column permutations. The result is

$$\int \cdots \int K\begin{pmatrix} x, x_1, \dots, x_k \\ y, x_1, \dots, x_k \end{pmatrix} dx_1 \cdots dx_k = K(x, y) \int \cdots \int K\begin{pmatrix} x_1, \dots, x_k \\ x_1, \dots, x_k \end{pmatrix} dx_1 \cdots dx_k - k \int \cdots \int K(x, x_1)\, K\begin{pmatrix} x_1, x_2, \dots, x_k \\ y, x_2, \dots, x_k \end{pmatrix} dx_1 \cdots dx_k.$$

Divide this by k! and sum. Recalling the definition (10) of R(x, y) and (6) of D, we can write the resulting relation as

$$R(x, y) = K(x, y)\, D - \int K(x, x_1)\, R(x_1, y)\, dx_1,$$

which we rewrite as

$$R(x, y) + \int K(x, z)\, R(z, y)\, dz - D\, K(x, y) = 0. \tag{12}$$

If, instead of using the first row, we expand the determinants in (10) according to the first column, we get the analogous identity

$$R(x, y) + \int K(z, y)\, R(x, z)\, dz - D\, K(x, y) = 0. \tag{12'}$$
We return now to the integral equation (1). Fredholm regarded the left side as an operator acting on the unknown u; let's denote the integral operator with kernel K(x, y) by the symbol K:

$$(Ku)(x) = \int K(x, y)\, u(y)\, dy. \tag{13}$$

Similarly R is the integral operator whose kernel is R. Equation (1) can be abbreviated as

$$(I + K)u = f. \tag{13'}$$

Fredholm observed that operators of form (13') form a semigroup, namely that the product of two such operators is of the same form:

$$(I + H)(I + K) = I + L, \tag{14}$$

where the kernel of L is

$$L(x, y) = K(x, y) + H(x, y) + \int H(x, z)\, K(z, y)\, dz. \tag{14'}$$


Theorem 2. Let K be a continuous kernel, and suppose that D ≠ 0. Then the operator I + K is invertible, and its inverse is $I - D^{-1}R$, where D and R are defined by (6) and (10).


Proof. Relations (12) and (12') mean in operator language that

$$R + KR - DK = 0, \qquad R + RK - DK = 0. \tag{15}$$

Since D is assumed ≠ 0, we can rewrite these as

$$(I + K)(I - D^{-1}R) = I, \qquad (I - D^{-1}R)(I + K) = I. \tag{15'}$$
□
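A worked example may make theorem 2 concrete. It is a sketch under the assumption of the rank-one kernel K(x, y) = xy on [0, 1], chosen only because all the determinants involved can be evaluated by hand.

```latex
% Assumed kernel: K(x,y) = xy on [0,1].
% Every determinant (5) of order >= 2 has proportional rows, hence vanishes.
\[
  D = 1 + \int_0^1 K(x, x)\, dx = 1 + \int_0^1 x^2\, dx = \frac{4}{3},
  \qquad
  R(x, y) = K(x, y) = xy .
\]
% Check of relation (12):
\[
  R(x,y) + \int_0^1 K(x,z) R(z,y)\, dz - D\, K(x,y)
  = xy + xy \int_0^1 z^2\, dz - \frac{4}{3}\, xy = 0 .
\]
% Theorem 2 then gives the solution of u + Ku = f as
\[
  u(x) = f(x) - \frac{3}{4}\, x \int_0^1 y\, f(y)\, dy .
\]
% For instance f(x) = x gives u(x) = \tfrac{3}{4} x, and indeed
% u(x) + x \int_0^1 y \cdot \tfrac{3}{4} y \, dy = \tfrac{3}{4} x + \tfrac{1}{4} x = x .
```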


The converse is also true. 


Theorem 3. Let K be a continuous kernel such that D = 0; then the operator I + K has a nontrivial nullspace and so is not invertible.


Proof. In R(x, y) fix the value of y, and denote the resulting function of x by r:

$$R(x, y) = r(x).$$

Since D = 0, equation (12) can be written as follows:

$$r + Kr = 0;$$

this shows that r belongs to the nullspace of I + K. The flaw in this argument is that R(x, y) could be zero for all x and y, so that r is the zero function. So we must argue differently.


Let λ denote a complex parameter. Replacing K by λK in formula (6) gives a power series in λ, which we denote as

$$D(\lambda) = \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} \int \cdots \int K\begin{pmatrix} x_1, \dots, x_k \\ x_1, \dots, x_k \end{pmatrix} dx_1 \cdots dx_k. \tag{16}$$

Similarly we define the function R(x, y; λ) by replacing K with λK in (10):

$$R(x, y; \lambda) = \sum_{k=0}^{\infty} \frac{\lambda^{k+1}}{k!} \int \cdots \int K\begin{pmatrix} x, x_1, \dots, x_k \\ y, x_1, \dots, x_k \end{pmatrix} dx_1 \cdots dx_k. \tag{17}$$


Lemma 4. D(λ) and R(x, y; λ) are entire analytic functions of λ, that is, analytic for all complex values of λ.


Exercise 2. Prove lemma 4.


If we set λ = 1 in (16) and (17), we get back D and R defined in (6) and (10). So, if D = 0, D(λ) has a zero at λ = 1. Since D(λ) is analytic and not ≡ 0, this zero is of finite order.


Lemma 5. Suppose that D(λ) has a zero of order m at λ = 1. Then there is a value of x such that R(x, x; λ) has a zero of order < m at λ = 1.


Proof. Set y = x in (17) and integrate with respect to x:

$$\int R(x, x; \lambda)\, dx = \sum_{k=0}^{\infty} \frac{\lambda^{k+1}}{k!} \int \cdots \int K\begin{pmatrix} x, x_1, \dots, x_k \\ x, x_1, \dots, x_k \end{pmatrix} dx\, dx_1 \cdots dx_k.$$

The right side equals the derivative of (16) with respect to λ and then multiplied by λ:

$$\int R(x, x; \lambda)\, dx = \lambda\, \frac{d}{d\lambda} D(\lambda).$$

By definition of m, the right side has a zero of order m − 1 at λ = 1; therefore so does the left side. But then R(x, x; λ) cannot have a zero of order ≥ m for every x. □


Denote by ℓ the largest number such that R(x, y; λ) has a zero of order ℓ at λ = 1 for every x and y. According to lemma 5, ℓ < m. We write

$$R(x, y; \lambda) = g(x, y)\, (\lambda - 1)^{\ell} + O\big( (\lambda - 1)^{\ell + 1} \big). \tag{18}$$

By definition of ℓ, $g(x, y) \not\equiv 0$. We take now relation (12) for λ ≠ 1:

$$R(x, y; \lambda) + \int \lambda K(x, z)\, R(z, y; \lambda)\, dz = \lambda K(x, y)\, D(\lambda).$$

Divide both sides by $(\lambda - 1)^{\ell}$ and let λ tend to 1. Since D(λ) has a zero of order m > ℓ, the right side tends to zero. According to (18), $R(x, y; \lambda)(\lambda - 1)^{-\ell}$ tends to g, so we obtain

$$g(x, y) + \int K(x, z)\, g(z, y)\, dz = 0. \tag{19}$$

Since $g \not\equiv 0$, there is a value $y_0$ such that $g(x, y_0) \ne 0$ for some x. Clearly, $u(x) = g(x, y_0)$ belongs to the nullspace of I + K, as asserted in theorem 3. □


Exercise 3. Suppose that the kernel K(x, y) is degenerate; that is, $K(x, y) =
\sum_{i=1}^{n} k_i(x) h_i(y)$. Show that then $D(\lambda)$ is a polynomial of degree $\le n$.


We connect now the zeros of the analytic function $D(\lambda)$ to the eigenvalues of K.


Theorem 6. The complex number $\kappa$ is an eigenvalue of the integral operator K iff
$\lambda = -1/\kappa$ is a zero of $D(\lambda)$.


Proof. Theorems 2 and 3 say that $\kappa = -1$ belongs to the spectrum of the integral
operator K iff D(1) = 0. Replacing K by $\lambda K$ yields theorem 6. □
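The correspondence in theorem 6 can be observed numerically. The sketch below is not from the text: it approximates $D(\lambda)$ by the determinant of a midpoint-rule discretization of $I + \lambda K$ on [0, 1] (an assumed approximation scheme), for the illustrative kernel K(x, y) = xy, whose only nonzero eigenvalue is 1/3 with eigenfunction x. The helper names `det` and `fredholm_det` are hypothetical.

```python
def det(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]          # work on a copy
    n = len(a)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]    # a row swap flips the sign
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return d


def fredholm_det(kernel, lam, n=40):
    """Midpoint-rule approximation of D(lam) = det(I + lam*K) on [0, 1]."""
    h = 1.0 / n
    xs = [(i + 0.5) * h for i in range(n)]
    m = [[(1.0 if i == j else 0.0) + lam * h * kernel(xs[i], xs[j])
          for j in range(n)] for i in range(n)]
    return det(m)


def kernel(x, y):
    # Illustrative rank-one kernel; exact determinant is D(lam) = 1 + lam/3.
    return x * y


d_at_1 = fredholm_det(kernel, 1.0)      # close to 1 + 1/3
d_at_zero = fredholm_det(kernel, -3.0)  # close to 0, since lam = -1/kappa with kappa = 1/3
```

The zero of the approximate determinant sits near $\lambda = -3 = -1/\kappa$, matching the statement of the theorem for this kernel.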




Theorem 6 shows that $D(\lambda)$ plays the role of the characteristic polynomial for
the operator K. The algebraic multiplicity of an eigenvalue $\kappa$ of an operator K is
the dimension of the space of its generalized eigenvectors, that is, the union of the
nullspaces of $(\kappa I - K)^i$, i any positive integer. According to theorem 6 of chapter 21,
the geometric multiplicity of a nonzero eigenvalue of a compact operator is finite.


Theorem 7. Let $\kappa$ be a nonzero eigenvalue of the operator K defined in (13). According
to theorem 6, $\lambda = -1/\kappa$ is a zero of the function $D(\lambda)$ defined in (16).
Denote by m the multiplicity of this zero. Claim: m is the algebraic multiplicity of
the eigenvalue $\kappa$.


Proof. The proof is the same as that in the finite-dimensional case, employing minors
of cofinite order of the determinant D. We omit the details. □


The next result is a generalization to operators of two important matrix relations. 


Theorem 8. Denote by $\kappa_1, \kappa_2, \ldots$ the eigenvalues of an integral operator K whose
kernel K(x, y) is Hölder continuous in x or y with Hölder exponent > 1/2. Then

$$\int K(x, x)\, dx = \sum_j \kappa_j, \tag{20}$$

$$D = \prod_j (1 + \kappa_j). \tag{20'}$$

The series and the product converge absolutely.


The quantity on the left in (20) is called the trace of the integral operator K; (20)
is called the trace formula. In chapter 30 we will establish such a trace formula for a
large, invariantly defined class of operators in Hilbert space.
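A concrete kernel makes (20) and (20') tangible. The sketch below is an illustration chosen here, not an example from the text: for K(x, y) = min(x, y) on [0, 1], which is Lipschitz and so satisfies the Hölder hypothesis, the eigenvalues are $\kappa_j = 1/((j - 1/2)\pi)^2$, the trace is $\int_0^1 x\,dx = 1/2$, and the product $\prod (1 + \kappa_j)$ equals cosh 1 by the classical product formula for cosh. The code sums and multiplies the first N terms; N and the variable names are assumptions of the example.

```python
import math


def kappa(j):
    """j-th eigenvalue of the integral operator with kernel min(x, y) on [0, 1]."""
    return 1.0 / (((j - 0.5) * math.pi) ** 2)


N = 200000  # truncation level for the series and the product (illustrative choice)

# Left side of (20): the integral of K(x, x) = x over [0, 1].
trace_from_kernel = 0.5

# Right side of (20): partial sum of the eigenvalues.
trace_from_eigs = sum(kappa(j) for j in range(1, N + 1))

# Right side of (20'): partial product, accumulated through logarithms.
det_from_eigs = math.exp(sum(math.log1p(kappa(j)) for j in range(1, N + 1)))

# Closed form of the infinite product for this kernel.
det_closed_form = math.cosh(1.0)
```

Both truncated quantities agree with their closed forms up to a tail of order 1/N, which also illustrates the absolute convergence asserted in the theorem.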

The proof of theorem 8 relies on Hadamard's factorization theorem. We state it
for the case that is needed:


Theorem 9. Let $f(\lambda)$ be an entire analytic function of $\lambda$, defined for all complex $\lambda$,
of order < 1, that is, it satisfies the growth condition

$$|f(\lambda)| \le \text{const.} \exp(|\lambda|^{\rho}), \qquad \rho < 1. \tag{21}$$

Denote the roots of f, counted with multiplicity, by $\{\lambda_j\}$, and assume that $\lambda = 0$ is
not a root, that is, that $f(0) \ne 0$. Then $\sum |\lambda_j|^{-1}$ converges, and f can be factored
as

$$f(\lambda) = f(0) \prod_j \left(1 - \frac{\lambda}{\lambda_j}\right). \tag{22}$$


This result is not deep; for a proof, we refer the reader to Ahlfors's text on complex
variables. We will apply (22) to $f(\lambda) = D(\lambda)$:




Lemma 10. If the kernel K satisfies a Hölder condition with exponent $\alpha$, with respect
to either the x or the y variable, then $D(\lambda)$ satisfies the growth condition

$$|D(\lambda)| \le \text{const.} \exp\left(|\lambda|^{2/(1 + 2\alpha)}\right).$$


Exercise 4. Prove lemma 10, using the definition (16) of $D(\lambda)$ and inequality (8).


Clearly, for $\alpha > 1/2$ the function $D(\lambda)$ is of order less than 1, and so can be factored
as in (22). Using the connection established in theorem 6 between the zeros of D and
the eigenvalues $\kappa_j$ of K, we can write this factorization as

$$D(\lambda) = \prod_j (1 + \lambda \kappa_j). \tag{22'}$$

Here we have made use of the fact, evident from (16), that D(0) = 1. We differentiate
(22') and set $\lambda = 0$. On the left side in (22') we obtain $dD(\lambda)/d\lambda|_{\lambda = 0}$, which
according to the power series expansion (16) is equal to $\int K(x, x)\, dx$. The infinite
product on the right is for $|\lambda| < 1$ the uniform limit of finite products; therefore
its derivative is the limit of the derivatives of the finite products. This shows that the
derivative of the right side of (22') at $\lambda = 0$ equals $\sum \kappa_j$. This proves (20). Setting
$\lambda = 1$ in (22') gives identity (20'). This completes the proof of theorem 8. □


24.2 THE MULTIPLICATIVE PROPERTY OF
THE FREDHOLM DETERMINANT


Next we present Fredholm's extension of the multiplicative property of determinants
to operators. We will denote the determinant of I + K by $D_K$, of I + H by $D_H$, and
so on. Similarly we will denote the inverse of I + K by $I - D_K^{-1} R_K$, and the kernel
of $R_K$ as $R_K(x, y)$.


Theorem 11. Let H and K be integral operators with continuous kernels, and set
(I + H)(I + K) = I + L. Then

$$D_L = D_H D_K. \tag{23}$$


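Proof. We present Fredholm's beautiful argument based on computing the variation
of $D_K$. We define

$$\delta D_K = \frac{d}{d\varepsilon} D_{K + \varepsilon\, \delta K} \Big|_{\varepsilon = 0}. \tag{24}$$

To compute $\delta D_K$ as function of $\delta K$, we first calculate the variation of the determinant
(5). Using formula (3) for differentiating a determinant, we get a sum of k
determinants:

$$\delta K\begin{pmatrix} x_1, \ldots, x_k \\ y_1, \ldots, y_k \end{pmatrix} = \sum_{\ell = 1}^{k} \det K_\ell, \tag{25}$$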




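where the $\ell$th column of $K_\ell$ is $\delta K(x_i, y_\ell)$. Expand $\det K_\ell$ with respect to the $\ell$th
column:

$$\delta K\begin{pmatrix} x_1, \ldots, x_k \\ y_1, \ldots, y_k \end{pmatrix} = \sum_{\ell, m = 1}^{k} (-1)^{\ell + m}\, K\begin{pmatrix} x_1, \ldots, (x_m), \ldots, x_k \\ y_1, \ldots, (y_\ell), \ldots, y_k \end{pmatrix} \delta K(x_m, y_\ell), \tag{25'}$$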


where the parentheses indicate that the $\ell$th column and the mth row are to be omitted.
Set $y_i = x_i$ and integrate (25') over the k-dimensional unit cube. All k terms with $\ell = m$ are equal;
denoting $x_\ell = x_m = x$ and relabeling the remaining variables $x_1, \ldots, x_{k-1}$, we get
the following expression for the integral of the sum of all terms in (25') with $\ell = m$:

$$k \int\!\cdots\!\int K\begin{pmatrix} x_1, \ldots, x_{k-1} \\ x_1, \ldots, x_{k-1} \end{pmatrix} dx_1 \cdots dx_{k-1} \int \delta K(x, x)\, dx. \tag{26}$$

The remaining k(k - 1) integrals with $\ell \ne m$ are also equal; relabel $x_m = x$,
$x_\ell = y$, and the remaining variables as $x_1, \ldots, x_{k-2}$. Suppose that $\ell < m$; then
$\ell - 1$ row transpositions and m - 2 column transpositions bring all terms in (25')
with $\ell \ne m$ into the same form so that the sum of all these terms is

$$-k(k - 1) \int\!\cdots\!\int K\begin{pmatrix} y, x_1, \ldots, x_{k-2} \\ x, x_1, \ldots, x_{k-2} \end{pmatrix} \delta K(x, y)\, dx_1 \cdots dx_{k-2}\, dx\, dy. \tag{26'}$$


To obtain $\delta D_K$, we form termwise the variation of the series (6) defining $D_K$. We
get the sum of (26) and (26') divided by k!. Using the definition (6) of $D_K$, we write
the sum coming from (26) as

$$D_K \int \delta K(x, x)\, dx,$$

and the sum coming from (26') in terms of $R_K(x, y)$ defined by (10), as

$$-\iint R_K(y, x)\, \delta K(x, y)\, dx\, dy.$$

Altogether we get

$$\delta D_K = D_K \int \delta K(x, x)\, dx - \iint R_K(y, x)\, \delta K(x, y)\, dx\, dy.$$

Assume that $D_K \ne 0$ and divide by $D_K$; we get

$$\delta \log D_K = \int \delta K(x, x)\, dx - D_K^{-1} \iint R_K(y, x)\, \delta K(x, y)\, dx\, dy. \tag{27}$$

According to formula (15') the operator $I - D_K^{-1} R_K$ is the inverse of the operator
I + K. Therefore we can regard the x integration on the right in (27) as an application
of $(I + K)^{-1}$ to $\delta K(\cdot, y)$, and rewrite (27) in operator notation as

$$\delta \log D_K = \int (I + K)^{-1}\, \delta K(\cdot, y)\, dy. \tag{28}$$




Alternatively we can regard the y integration on the right in (27) as the application
of the transpose of $I - D_K^{-1} R_K$ to the function $\delta K(x, \cdot)$. Since the transpose of the
inverse is the inverse of the transpose, we can also rewrite (27) as follows:

$$\delta \log D_K = \int (I + K')^{-1}\, \delta K(x, \cdot)\, dx, \tag{28'}$$

where K' denotes the transpose of the operator K.
Let K and H be any two continuous kernels, $D_K \ne 0$, $D_H \ne 0$. Denote, as
before, the product of the operators I + H and I + K as I + L. By theorem 2, I + K
and I + H are invertible; therefore so is their product I + L. Then by theorem 3, $D_L \ne 0$.
To compute the variation of $D_L$ in terms of $\delta H$ and $\delta K$, we express the kernel of
L in terms of H and K. Since L = K + H + HK,

$$L(x, y) = K(x, y) + H(x, y) + \int H(x, z) K(z, y)\, dz,$$

and so

$$\delta L(x, y) = \delta K(x, y) + \delta H(x, y) + \int \delta H(x, z) K(z, y)\, dz + \int H(x, z)\, \delta K(z, y)\, dz. \tag{29}$$


We can express the integrations on the right in (29) as the actions of the operators K'
and H:

$$\delta L(x, y) = (I + K')\, \delta H(x, \cdot) + (I + H)\, \delta K(\cdot, y). \tag{29'}$$


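We set this expression for $\delta L$ into formula (27) applied to the kernel L. We express
the terms involving $\delta H$ as in (28') and those involving $\delta K$ as in (28); we obtain

$$\delta \log D_L = \int (I + L')^{-1} (I + K')\, \delta H(x, \cdot)\, dx + \int (I + L)^{-1} (I + H)\, \delta K(\cdot, y)\, dy. \tag{30}$$

We recall that I + L is defined as (I + H)(I + K); taking the inverse and then the
transpose, we get

$$(I + L)^{-1} = (I + K)^{-1} (I + H)^{-1}, \qquad (I + L')^{-1} = (I + H')^{-1} (I + K')^{-1}.$$

Setting these into (30) gives

$$\delta \log D_L = \int (I + H')^{-1}\, \delta H(x, \cdot)\, dx + \int (I + K)^{-1}\, \delta K(\cdot, y)\, dy. \tag{30'}$$

Comparing this with (28) and (28'), we get

$$\delta \log D_L = \delta \log D_H + \delta \log D_K. \tag{31}$$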




Now deform K and H into zero so that $D_K \ne 0$, $D_H \ne 0$ during this deformation.
This is easily done by setting $K(t) = \lambda(t) K$, $H(t) = \lambda(t) H$, where the complex-valued
function $\lambda(t)$ avoids zeros of both $D_K(\lambda)$ and $D_H(\lambda)$. Formula (31) shows
that

$$\frac{d}{dt} \left[ \log D_{L(t)} - \log D_{K(t)} D_{H(t)} \right] = 0.$$

Since L(0) = K(0) = H(0) = 0 and $D_0 = 1$, we deduce that $\log D_L - \log D_K D_H =
0$; this proves the multiplicative property of the determinant when $D_K \ne 0$, $D_H \ne 0$.
When $D_H = 0$, I + H is not onto, and when $D_K = 0$, I + K is not one-to-one. In
either case (I + H)(I + K) is not invertible, so $D_L = 0$. Thus $D_L = D_H D_K$ in all
cases. □
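The finite-dimensional identity that theorem 11 extends can be checked directly on discretized kernels. The sketch below is an illustration under assumptions made here: the kernels exp(-|x - y|) and cos(x + y), the midpoint rule with n = 20 points, and the helper names `det`, `matmul`, and `add_identity` are all choices of the example. It forms the matrix analogue of L = K + H + HK and compares det(I + L) with det(I + H) det(I + K).

```python
import math


def det(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    n = len(a)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return d


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add_identity(a):
    n = len(a)
    return [[a[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
            for i in range(n)]


n = 20
h = 1.0 / n
xs = [(i + 0.5) * h for i in range(n)]

# Quadrature-weighted kernel matrices standing in for the operators H and K.
H = [[h * math.exp(-abs(x - y)) for y in xs] for x in xs]
K = [[h * math.cos(x + y) for y in xs] for x in xs]

# Matrix analogue of L = K + H + HK, i.e. (I + H)(I + K) = I + L.
HK = matmul(H, K)
L = [[H[i][j] + K[i][j] + HK[i][j] for j in range(n)] for i in range(n)]

D_H = det(add_identity(H))
D_K = det(add_identity(K))
D_L = det(add_identity(L))
```

For matrices the identity is exact up to rounding; the content of theorem 11 is that it survives the passage to the limit defining the Fredholm determinant.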


Exercise 5. Justify the calculation of $\delta D_K$ termwise in the infinite series defining
$D_K$.


24.3 THE GELFAND-LEVITAN-MARCHENKO EQUATION 
AND DYSON’S FORMULA 


So far we have taken the integration in the integral equation (1) over the unit interval.
It could of course be any interval [a, b], even an infinite interval $[a, \infty)$, provided that
the kernel K tends to zero fast enough as its arguments tend to $\infty$.


Exercise 6. Show that if

$$|K(x, y)| \le M(x) M(y),$$

where M(x) is a decreasing function and integrable,

$$\int_a^\infty M(x)\, dx < \infty,$$

then the series (6) and (10) defining the Fredholm determinants converge.


Let K(x, y) be a continuous kernel defined for all real x and y; suppose that as
x, y tend to $\infty$, the kernel tends to zero with sufficient rapidity. For any real a we
define $K_a$ to be the integral operator over the interval $(a, \infty)$ with kernel K:

$$(K_a u)(x) = \int_a^\infty K(x, y)\, u(y)\, dy, \qquad a < x. \tag{32}$$

Denote by D(a) the Fredholm determinant of $I + K_a$. We will study how D(a)
depends on a; we assume that $D(a) \ne 0$.
Let h be any real number; the simple transformation $y \to y + h$ carries the
operator $K_{a+h}$ into the integral operator with kernel K(x + h, y + h) over the fixed



interval $(a, \infty)$. The derivative of this kernel with respect to h at h = 0 is $\delta K =
K_x + K_y$. Therefore the derivative of D(a) with respect to a can be computed from
formulas (27), (28) for the variation of D. To carry this out, we need a few identities
connecting the kernels K and R. We derive these from formulas (12) and (12'), which
express the fact that I + K and $I - D^{-1} R$ are inverses of each other:

$$R(x, y) + \int K(x, z) R(z, y)\, dz - D K(x, y) = 0, \tag{12}$$

$$R(x, y) + \int K(z, y) R(x, z)\, dz - D K(x, y) = 0. \tag{12'}$$

Differentiate (12) with respect to x:

$$R_x(x, y) + \int K_x(x, z) R(z, y)\, dz - D K_x(x, y) = 0.$$

Setting y = x in this relation gives

$$R_x(x, x) + \int K_x(x, z) R(z, x)\, dz - D K_x(x, x) = 0. \tag{33}$$

Similarly

$$R_y(y, y) + \int K_y(z, y) R(y, z)\, dz - D K_y(y, y) = 0. \tag{33'}$$

We take now formula (27) for the variation of log D and set $\delta K = K_x + K_y$:

$$\delta \log D(a) = \int [K_x(x, x) + K_y(x, x)]\, dx - D^{-1} \iint R(y, x) [K_x(x, y) + K_y(x, y)]\, dx\, dy. \tag{34}$$


We use relations (33) and (33') to replace the two double integrals on the right in
(34) by single integrals:

$$\begin{aligned}
\delta \log D(a) &= \int [K_x(x, x) + K_y(x, x)]\, dx + \int [D^{-1} R_x(x, x) - K_x(x, x)]\, dx \\
&\quad + \int [D^{-1} R_y(y, y) - K_y(y, y)]\, dy \\
&= D^{-1} \int_a^\infty [R_x(x, x) + R_y(x, x)]\, dx \\
&= -D^{-1} R(a, a).
\end{aligned}$$

Since the variation of log D(a) is its derivative with respect to a,

$$\frac{d}{da} \log D(a) = -D^{-1}(a) R(a, a), \quad \text{which gives} \quad \frac{d}{da} D(a) = -R(a, a). \tag{35}$$




Exercise 7. Derive formula (35) by differentiating formula (6) with respect to the 
lower limit of integration. 


We return now to formula (12) and note that it has the following interpretation.
Let y be any fixed number, a < y; then the solution w' of the integral equation

$$w' + K_a w' = K(\cdot, y)$$

is $w'(x) = D^{-1}(a) R(x, y)$. Setting, in particular, y = a, x = a, we get $w'(a) =
D^{-1}(a) R(a, a)$. Using the relation (35), we have therefore


Theorem 12. Denote by w = w(x; a) the solution of the integral equation

$$w(x) + \int_a^\infty K(x, y)\, w(y)\, dy = -K(x, a). \tag{36}$$

Then

$$w(a; a) = \frac{d}{da} \log D(a). \tag{37}$$


We show now how to apply this result to the so-called inverse problem of scattering
theory concerning the differential operator

$$-\partial_x^2 + q, \tag{38}$$

where q = q(x) is a potential that rapidly tends to zero as $x \to \pm\infty$. The inverse
problem is to determine the potential q from the so-called reflection coefficient, to be 
defined in chapter 37. This problem arises in physics, in situations where the potential 
q is inaccessible to direct measurement whereas the reflection coefficient is. 

Gelfand, Levitan, and Marchenko have devised an integral equation from whose
solution q can be determined. Denote by Z(x) the Fourier transform of the reflection
coefficient; in the case where the operator (38) has point eigenvalues, these have to
be included in Z. The G-L-M equation for an auxiliary function w is

$$w(x; a) + \int_a^\infty Z(x + y)\, w(y; a)\, dy = -Z(x + a). \tag{39}$$

According to G-L-M (e.g., see Lax, pp. 84-91) the potential q is related to the solution
w of the G-L-M equation by

$$q(a) = -2 \frac{d}{da} w(a; a). \tag{40}$$

Equation (39) is a special case of equation (36), with K(x, y) = Z(x + y). We can
therefore appeal to theorem 12 to deduce from (40) the following formula due to
Freeman Dyson:

$$q(a) = -2 \frac{d^2}{da^2} \log D(a), \tag{41}$$




where D(a) is the Fredholm determinant of the G-L-M operator acting on w on the 
left in (39). 
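Formulas (36), (37), and (41) can be followed through by hand for a kernel of rank one. The choice $Z(x + y) = e^{-(x + y)}$ below is an illustrative assumption made here for the sake of an explicit computation, not an example taken from the text; the symbols c(a) and $a_0$ are likewise introduced only for this sketch.

```latex
% Illustrative rank-one kernel (an assumption, chosen so everything is explicit):
K(x,y) = Z(x+y) = e^{-(x+y)} .
% K_a has rank one on (a, \infty), so its Fredholm determinant is
D(a) = 1 + \int_a^\infty e^{-2x}\,dx = 1 + \tfrac12 e^{-2a} .
% Solve (36) with the ansatz w(x;a) = c(a)\,e^{-x}:
c(a)\bigl(1 + \tfrac12 e^{-2a}\bigr) = -e^{-a}
\quad\Longrightarrow\quad
w(a;a) = \frac{-e^{-2a}}{1 + \tfrac12 e^{-2a}} = \frac{d}{da}\log D(a),
% in agreement with (37). Formula (41) then gives
q(a) = -2\,\frac{d^2}{da^2}\log D(a) = -2\,\operatorname{sech}^2(a - a_0),
\qquad a_0 = -\tfrac12 \log 2 .
```

The last step uses $\frac{d^2}{da^2} \log(1 + u) = 4u/(1 + u)^2$ for $u = \tfrac12 e^{-2a}$, which is $\operatorname{sech}^2(a - a_0)$.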


BIBLIOGRAPHY 


Ahlfors, L. V. Complex Analysis. McGraw-Hill, New York, 1979.


Dyson, F. J. Fredholm determinants and inverse scattering problems. Comm. Math. Phys., 47 (1976):
171-183.


Fredholm, I. Sur une classe d'équations fonctionnelles. Acta Math., 27 (1903): 365-390.


Gelfand, I. M. and Levitan, B. M. On the determination of a differential equation by its spectral function.
Izv. Akad. Nauk SSSR Ser. Mat., 15 (1951): 309-360; AMS Transl., 1 (1955): 254-304.


Lax, P. D. Outline of a Theory of the KdV Equation. Recent Mathematical Methods in Nonlinear Wave
Propagation, Lecture Notes in Mathematics, 1640. Springer Verlag, 1996, pp. 70-102.


Marchenko, V. A. Concerning the theory of differential operators of second order. Dokl. Akad. Nauk SSSR,
72 (1950): 457-460.


25


INVARIANT SUBSPACES


Let X denote a Banach space, M a bounded linear map: $X \to X$. We recall from
linear algebra the notion of an invariant subspace: it is a subspace Y of X that is
mapped into itself under M. Y is called nontrivial if it is not the whole space and
consists of more than the zero vector. In the context of normed linear spaces we are
interested in closed invariant subspaces Y under M.

Given a closed invariant subspace Y of M, there is the possibility of inverting M
in two stages. The task is, given u in X, to find an $x_0$ in X such that $M x_0 = u$. First
we solve this equation mod Y, that is, find an $x_1$ that satisfies $M x_1 \equiv u \pmod{Y}$;
then $z = u - M x_1$ belongs to Y. Put $x_0 = x_1 + y$, and choose y in Y so that it
satisfies My = z. The first step amounts to inverting M on the quotient space X/Y;
this task can be handled adequately by analytical tools only if X/Y has a normed
structure, and that is possible only if Y is a closed subspace of X.
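The two-stage inversion can be traced in the smallest possible setting. The sketch below is a toy illustration assumed here, not taken from the text: X is the plane, M is an upper triangular 2 x 2 matrix, and Y is the span of the first basis vector, which M maps into itself. The names `apply`, `x1`, `z`, `y`, and `x0` mirror the symbols in the paragraph above.

```python
# M leaves Y = span{e1} invariant because its lower-left entry is zero.
M = [[2.0, 1.0],
     [0.0, 3.0]]
u = [5.0, 6.0]


def apply(m, v):
    """Multiply the 2 x 2 matrix m by the vector v."""
    return [sum(m[i][j] * v[j] for j in range(2)) for i in range(2)]


# Stage 1: solve M x1 = u modulo Y. On the quotient X/Y, identified with the
# second coordinate, M acts as multiplication by M[1][1].
x1 = [0.0, u[1] / M[1][1]]

# The residual z = u - M x1 lies in Y: its second coordinate vanishes.
Mx1 = apply(M, x1)
z = [u[0] - Mx1[0], u[1] - Mx1[1]]

# Stage 2: solve M y = z inside Y, where M acts as multiplication by M[0][0].
y = [z[0] / M[0][0], 0.0]

# Combine the two stages: x0 = x1 + y solves M x0 = u.
x0 = [x1[0] + y[0], x1[1] + y[1]]
```

In a basis adapted to Y, any operator with an invariant subspace takes this block triangular form, which is why the two diagonal blocks (M on Y and M on X/Y) are what must be inverted.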

It follows from the algorithm presented above that if M is invertible on X/Y and 
on Y, then it is invertible on X. The converse is false, as shown in chapter 2. 


Exercise 1. Prove that if $M : X \to X$ is invertible, and Y a finite-dimensional
subspace of X invariant under M, then M is invertible on Y and X/Y.


Exercise 2. Show that the subspaces of X invariant under an operator M form a
lattice in the sense that the intersection of two invariant subspaces is an invariant
subspace, and the closure of the sum of two invariant subspaces is an invariant subspace.


25.1 INVARIANT SUBSPACES OF COMPACT MAPS


Of course every subspace spanned by eigenvectors of M is invariant under M, but
there are plenty of operators without any eigenvalues. As we will show below,
many of these possess nontrivial invariant subspaces. The following result is due to
von Neumann for Hilbert space, and to Aronszajn and Smith for Banach spaces. The
proof given below is due to Lomonosov, with a simplification due to Hilden.




Theorem 1. A compact map C of a complex Banach space X of dimension greater
than 1 into itself has a nontrivial closed invariant subspace.


Proof. Suppose that $C \ne 0$; normalize C so that

$$|C| = 1. \tag{1}$$

Since $C \ne 0$, we can choose $x_0$ in X so that

$$|C x_0| > 1, \qquad |x_0| > 1. \tag{2}$$

Denote the closed unit ball around $x_0$ by B:

$$B = \{x : |x - x_0| \le 1\}. \tag{3}$$

It follows from (2) that the origin does not belong to B. Denote by K the closure of
the image of B under C:

$$K = \overline{CB}. \tag{4}$$

Since C is a compact map, K is a compact set. It follows from (2) and (1) that K
does not contain the origin.

We will prove the existence of an invariant subspace by arguing indirectly: assume
that C has no nontrivial closed invariant subspace. Now for any $y \ne 0$ the set
{p(C)y}, where p is any polynomial, is a linear subspace of X invariant under C.
The closure of this subspace is a closed invariant subspace of C, and since C is not
supposed to have any nontrivial ones, this one must be trivial, meaning all of X. So it
follows from our supposition that for any $y \ne 0$, the set {p(C)y}, p any polynomial,
is dense in X.

Since we have seen before that K does not contain 0, we can apply the above to
any y of K. In particular, for any y in K there is a polynomial p such that

$$|p(C) y - x_0| < 1. \tag{5}$$

Since (5) is a strict inequality, the set of y that satisfies (5) with given p is an open set
$O_p$. Such open sets cover K; since K is compact, a finite collection suffices. Thus
there is a finite collection of polynomials $p_1, \ldots, p_N$ such that for every y in K
inequality (5) holds with some $p = p_i$. We introduce the abbreviation $p_i(C) = C_i$,
i = 1, ..., N; the state of affairs can be expressed as follows: for every y in K,

$$|C_i y - x_0| < 1 \tag{6}$$

for at least one i.

The point $x_0$ belongs to B; according to definition (4) of K, $C x_0 \in K$. Therefore
we may in (6) set $y = C x_0$ and conclude that for some $i = i_1$, $|C_{i_1} C x_0 - x_0| < 1$.
By definition (3) of B, this means that the point $C_{i_1} C x_0$ belongs to B, and so, by
definition (4) of K, $C C_{i_1} C x_0 \in K$. Therefore we may in (6) set $y = C C_{i_1} C x_0$ and
conclude that for some $i = i_2$,

$$|C_{i_2} C\, C_{i_1} C x_0 - x_0| < 1.$$

Repeating this argument, we conclude recursively that

$$\Big| \prod_{j=1}^{n} (C_{i_j} C)\, x_0 - x_0 \Big| < 1.$$

We obtain from this by the triangle inequality that

$$\Big| \prod_{j=1}^{n} (C_{i_j} C)\, x_0 \Big| > |x_0| - 1; \tag{7}$$

note that according to (2), the right side is positive.
Since the $C_i$ are polynomials in C, they commute with each other and C, so (7)
can be rewritten as

$$\Big| \Big( \prod_{j=1}^{n} C_{i_j} \Big) C^n x_0 \Big| > |x_0| - 1. \tag{7'}$$

Denote the largest of the norms of $C_i$ by c: $|C_i| \le c$, i = 1, ..., N. Then it follows
from (7') that

$$c^n |C^n|\, |x_0| > |x_0| - 1.$$

Taking the nth root and letting $n \to \infty$, we conclude that

$$\lim_{n \to \infty} |C^n|^{1/n} \ge \frac{1}{c}.$$

According to spectral theory (see theorem 4, chapter 17) the quantity on the left
is the spectral radius of C. Since $|\sigma(C)| > 0$, the spectrum of C contains points
other than 0. But according to the spectral theory of compact operators (theorem 6 of
chapter 21) such points are eigenvalues of C of finite multiplicity. The corresponding
eigenspace is invariant under C, in contradiction to our supposition. This proves
theorem 1. □


25.2 NESTED INVARIANT SUBSPACES


Once we have proved the existence of a single invariant subspace Y, we can, by
invoking that theorem repeatedly, prove the existence of infinitely many invariant
subspaces. Thus Y itself has an invariant subspace, and so does X/Y, which gives
rise to an invariant subspace Z, $Y \subset Z \subset X$.
In an interesting paper, Ringrose has explored such families of invariant subspaces.




Definition. Let X be a complex Banach space. A collection of closed subspaces
{M} of X is called a nest if the subspaces M are nested, meaning totally ordered by
inclusion; a nest will be denoted by the symbol $\mathcal{N}$.


Definition. Let X be a complex Banach space, $\mathcal{N}$ a nest of closed subspaces of X,
C a compact map of X into itself. If each subspace in the nest $\mathcal{N}$ is invariant under
C, then $\mathcal{N}$ is called an invariant nest for C.


Theorem 2. Let C be a compact map of a complex Banach space X into itself.

(i) There exists a maximal invariant nest $\mathcal{N}$ for C.
(ii) $\mathcal{N}$ contains the trivial subspaces {0} and X.
(iii) Let $\mathcal{N}_0$ be a subset of $\mathcal{N}$. The subspace $N = \bigcap \{L : L \text{ in } \mathcal{N}_0\}$ belongs to $\mathcal{N}$.


Proof. The existence of a maximal invariant nest follows from Zorn's lemma. The
space N described in (iii) is obviously closed and invariant under C. Take any K in
$\mathcal{N}$; if some L in $\mathcal{N}_0$ is contained in K, so is N. Otherwise, every L in $\mathcal{N}_0$ contains
K, but then so does N. Therefore, since $\mathcal{N}$ is maximal, N must belong to $\mathcal{N}$. □


Theorem 2' (Ringrose). Let C be a compact map of a complex Banach space X into
itself, $\mathcal{N}$ a maximal invariant nest of C. Denote by M any of the closed subspaces
contained in the nest $\mathcal{N}$, and denote by $M_-$ the subspace

$$M_- = \overline{\bigcup \{L \text{ in } \mathcal{N} : L \text{ properly contained in } M\}}. \tag{8}$$

By maximality, $M_-$ belongs to the nest $\mathcal{N}$.

(i) The quotient space $M/M_-$ has dimension 0 or 1.
(ii) Suppose that $\dim M/M_- = 1$. Then C maps $M/M_-$ into itself as multiplication
by a scalar; denote this scalar by $\mu$. If $\mu \ne 0$, $\mu$ is an eigenvalue of C.
(iii) Conversely, every nonzero eigenvalue $\gamma$ of C occurs as $\mu$ for some M in $\mathcal{N}$.
The number of times $\gamma$ occurs as $\mu$ in $\mathcal{N}$ equals the algebraic multiplicity of
$\gamma$ as eigenvalue of C, that is,

$$\max_i \dim \text{nullspace of } (\gamma I - C)^i.$$
l 


Proof. (i) Since both M and $M_-$ are invariant under C, C maps $M/M_-$ into itself
and is a compact map. If $\dim M/M_-$ were > 1, by the Aronszajn-Smith theorem C
would have a nontrivial invariant subspace in $M/M_-$. This would correspond to an
invariant subspace L of C in X, where L properly contains $M_-$ and is properly contained
in M. Since $\mathcal{N}$ is maximal, L belongs to $\mathcal{N}$; but this contradicts the definition
(8) of $M_-$.

(ii) $\mu I - C$ is the zero mapping of $M/M_-$ into itself. This means that $\mu I - C$
maps M into $M_-$. Since C restricted to M is compact, and $\mu \ne 0$, it follows from
theorem 5 of chapter 21 that the nullspace of $(\mu I - C)$ in M has the same dimension
as the codimension of $(\mu I - C)M$ in M. We have seen that $(\mu I - C)M$ is contained
in $M_-$, and therefore has codimension at least 1. This proves that the nullspace of
$(\mu I - C)$ contains nonzero vectors, namely that $\mu$ is an eigenvalue of C.

(iii) Suppose that $\gamma$ is an eigenvalue of C, and x a corresponding eigenvector.
Define $\mathcal{N}_x$ to be the collection of all subspaces in $\mathcal{N}$ which contain x. Define $M =
\bigcap \{K : K \text{ in } \mathcal{N}_x\}$; by theorem 2, M belongs to $\mathcal{N}_x$. Define $M_-$ by (8); suppose that
$M_-$ is properly contained in M. Then $M_-$ does not contain x, since by definition M
is the smallest space in $\mathcal{N}$ that contains x. So $M/M_-$ is represented by x; the action
of C on x is to multiply it by $\gamma$. It follows that $\mu = \gamma$, as asserted in part (iii).

To complete this proof, we have to show that $M_-$ is properly contained in M. According
to the Fredholm alternative, theorem 8 of chapter 21, a vector y in X belongs
to the range of $C - \gamma I$ iff (y, m) = 0 for every m in the nullspace of the transpose
$C' - \gamma I$. Take for simplicity the case that $\gamma$ is an eigenvalue of multiplicity 1. We
claim that the corresponding eigenvector x satisfies $(x, m) \ne 0$. If it did not, x would
be in the range of $C - \gamma I$, namely $(C - \gamma I)u = x$ for some u in X. Applying $(C - \gamma I)$
to this relation gives

$$(C - \gamma I)^2 u = 0,$$

which would make u a generalized eigenvector, and $\gamma$ an eigenvalue of multiplicity
at least 2, contrary to our assumption.

Let M be as above, the smallest subspace in the nest $\mathcal{N}$ that contains x. Let L be a
subspace in the nest $\mathcal{N}$ that is properly contained in M; such an L does not contain x.
Since $\gamma$ has multiplicity 1, the nullspace of $C - \gamma I$ over X is spanned by the single
eigenvector x. It follows that the nullspace of $C - \gamma I$ over L contains only the zero
vector. Therefore $C - \gamma I$ is a one-to-one map of L into L; according to the Fredholm
alternative, $C - \gamma I$ maps L onto itself. It follows that every vector y in L belongs
to the range of $C - \gamma I$ and therefore satisfies the compatibility condition (y, m) = 0.

We are ready to show that x does not belong to $M_-$ as defined by (8). If it did,
it would be the limit of a sequence of vectors $y_n$ in some $L_n$ properly contained
in M. As we have shown above, every such $y_n$ satisfies the compatibility condition
$(y_n, m) = 0$, but then, passing to the limit, we would deduce that (x, m) = 0. This
contradicts our previous finding that $(x, m) \ne 0$.


Exercise 3. Modify the argument above to include eigenvalues of arbitrary algebraic 
multiplicity. 


Constructing a nest of invariant subspaces is the Banach
space analogue of bringing a finite matrix into upper triangular form. An account of
the theory of such triangular forms can be found in K. R. Davidson's monograph on
nest algebras.
We give now an example: X = C[0, 1], V is integration,

$$(Vf)(t) = \int_0^t f(s)\, ds. \tag{9}$$




We saw in chapter 22, equation (15), that this map is compact but has no eigenvectors;
therefore its spectrum consists of the point 0. On the other hand, the closed
subspaces $C_a$ of C consisting of those f that vanish on [0, a] are obviously invariant
for V.
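The statement that the spectrum of V is the single point 0 can be made quantitative. The sketch below is a numerical illustration assumed here, not part of the text: it applies a trapezoid-rule version of V three times to the constant function 1 and compares with $t^3/3!$, then evaluates $(1/n!)^{1/n}$, using the standard fact that the sup-norm of $V^n$ on C[0, 1] is 1/n!. The grid size and the name `volterra` are choices of the example.

```python
import math


def volterra(values, h):
    """Cumulative trapezoid rule: approximates (Vf)(t) = integral of f from 0 to t."""
    out = [0.0]
    for i in range(1, len(values)):
        out.append(out[-1] + 0.5 * h * (values[i - 1] + values[i]))
    return out


N = 1000
h = 1.0 / N
ts = [i * h for i in range(N + 1)]

# Apply V three times to f = 1; the exact result is t^3 / 3!.
f = [1.0] * (N + 1)
v3 = volterra(volterra(volterra(f, h), h), h)
max_err = max(abs(v3[i] - ts[i] ** 3 / 6.0) for i in range(N + 1))

# n-th roots of the sup-norm of V^n, which is 1/n!; they decrease toward 0,
# consistent with a spectral radius equal to zero.
roots = [(1.0 / math.factorial(n)) ** (1.0 / n) for n in (1, 5, 10, 50)]
```

The decay of the roots is the spectral radius formula at work: a compact operator with spectral radius zero has no nonzero eigenvalues, so its invariant subspaces must be found by other means, as with the subspaces $C_a$.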


Exercise 4. Prove that the subspaces $C_a$, $0 \le a \le 1$, form a maximal nest.


Brodsky and Donoghue have, independently, proved in the $L^2$ topology the converse
of the proposition above:


Theorem 3. Define H to be the Hilbert space $L^2[0, 1]$, and define $H_a$ to consist
of those functions in H that are zero on the subinterval [0, a], $0 \le a \le 1$. The
integration operator V defined by (9) maps H into itself, and every $H_a$ into itself.
Conversely, V has no other invariant subspaces.


Proof. (Donoghue) We start with a brief discussion of the convolution of $L^1$ functions
on the real line. The convolution f * g of two $L^1$ functions on the real line is
defined by

$$(f * g)(t) = \int f(s)\, g(t - s)\, ds. \tag{10}$$

We have shown in section 6 of chapter 19 that the convolution of two $L^1$ functions
is $L^1$, and that convolution is associative and commutative.
Suppose that both f(s) and g(s) are zero for s large enough negative, and denote
by $\ell_f$, respectively $\ell_g$, the lower end of the support of f and g:

$$\ell_f = \sup\{\ell : f(s) = 0 \text{ for } s < \ell\}. \tag{11}$$

It follows from this definition that the integrand on the right in the definition (10)
of convolution is zero when $t < \ell_f + \ell_g$, for either one or the other factor is zero.
This shows that (f * g)(t) = 0 when $t < \ell_f + \ell_g$, from which it follows that
$\ell_{f * g} \ge \ell_f + \ell_g$. According to the Titchmarsh convolution theorem the sign of
equality holds:

$$\ell_{f * g} = \ell_f + \ell_g. \tag{12}$$

A proof of this remarkable theorem will be given in chapter 38.
The mapping V defined by (9) can be expressed as a convolution:

$$Vf = h * f, \tag{13}$$

where f is set to zero outside the interval [0, 1], and the function h is the Heaviside
function

$$h(s) = \begin{cases} 0 & \text{for } s < 0, \\ 1 & \text{for } 0 \le s. \end{cases} \tag{14}$$

Note that the right side of (13) is defined on all of R; its restriction to [0, 1] equals
Vf. It follows from the associative property for convolution that for any natural
number n,

$$V^n f = h^{(n)} * f, \tag{13'}$$

where $h^{(n)}$ is the n-fold convolution of h with itself. An easy calculation, already
performed in chapter 20, shows that

$$h^{(n)}(s) = \begin{cases} 0 & \text{for } s < 0, \\ s^{n-1}/(n-1)! & \text{for } 0 \le s. \end{cases} \tag{14'}$$

The nontrivial part of theorem 3 is that the only invariant subspaces of V consist
of all $L^2$ functions that vanish on an interval [0, a]. This follows from


Lemma 4. Denote by f any function in $L^2[0, 1]$. The set of functions $f, Vf, V^2 f,
\ldots$ span $L^2[\ell, 1]$, where $\ell = \ell_f$.


Proof. Suppose that g in $L^2[0, 1]$ is orthogonal to $V^n f$, n = 0, 1, .... For $n \ge 1$ we can,
using (13'), write this condition as

$$(h^{(n)} * f, g) = 0, \qquad n = 1, 2, \ldots, \tag{15}$$

where ( , ) denotes the scalar product in $L^2[0, 1]$. Now define the function $g_-$ in
$L^2(-1, 0)$ by

$$g_-(s) = g(-s). \tag{16}$$

For any function k in $L^2[0, 1]$, we can write the $L^2$ scalar product as

$$(k, g) = (k * g_-)(0). \tag{17}$$

Using associativity of convolution and (17), we can write condition (15) as

$$(h^{(n)} * f * g_-)(0) = 0;$$

using (17) once more, we find that this is equivalent to

$$(h^{(n)}_-, f * g_-) = 0. \tag{15'}$$

Note that $f * g_-$ is supported in [-1, 1], and $h^{(n)}_-(s) = (-s)^{n-1}/(n-1)!$ on the interval [-1, 0]
and zero for s > 0, so that the functions $h^{(n)}_-$, n = 1, 2, ..., span the polynomials on [-1, 0]. Since by the Weierstrass approximation theorem polynomials are
dense in $L^2[-1, 0]$, it follows from (15') that $f * g_- = 0$ on [-1, 0]. We can express
this by the inequality $\ell_{f * g_-} \ge 0$. According to the Titchmarsh convolution theorem,
it follows that

$$\ell_f + \ell_{g_-} \ge 0,$$

which implies that $g_-(s) = 0$ for $s < -\ell_f$. By definition (16) of $g_-$ this is the same
as

$$g(s) = 0 \quad \text{for } s > \ell_f.$$

Since g is any function orthogonal to all $V^n f$, this shows that the orthogonal complement
of the span of $\{V^n f\}$ is contained in $L^2[0, \ell]$. It follows that the closed span of $\{V^n f\}$
contains $L^2[\ell, 1]$. On the other hand, since each function $V^n f$ is zero on $[0, \ell]$, it
follows that the set $\{V^n f\}$ spans $L^2[\ell, 1]$. □


Theorem 3 follows easily from lemma 4. To see this, let Y be a closed invariant
subspace of V in $L^2[0, 1]$. Given any f in Y, $Vf$, $V^2 f$, and so on, also belong to Y,
and so by lemma 4, Y contains $L^2[\ell_f, 1]$. It follows that Y contains $L^2[a, 1]$, where

$$a = \inf_{f \in Y} \ell_f.$$

On the other hand, by definition of a, f(t) = 0 for t < a for any f in Y. This shows
that $Y = L^2[a, 1]$. □


The Aronszajn-Smith theorem has been extended in several directions. Robinson
and Bernstein have shown, using Robinson's nonstandard analysis, that an operator
T has an invariant subspace, provided that there is a polynomial p such that p(T)
is compact. Lomonosov has shown that every operator that commutes with a compact
operator has an invariant subspace. Whether every noncompact operator has an
invariant subspace had been an open problem, until Enflo exhibited a Banach space
X, especially constructed for this purpose, and a linear mapping of X into X that
is irreducible, meaning that it has no nontrivial invariant subspace. Subsequently
irreducible linear maps were exhibited for more familiar spaces but none in reflexive
spaces. It is in particular an open question if there are irreducible operators in Hilbert
space, and it is an open question whether this question is interesting.

In chapter 38 we will present Beurling's description of all invariant subspaces of
the right shift operator R acting on $\ell^2$.


BIBLIOGRAPHY 


Aronszajn, N. and Smith, K. T. Invariant subspaces of completely continuous operators. An. Math., 60 
(1954): 345-350. 


Bernstein, A. R. and Robinson, A. Solution of an invariant subspace problem of K. T. Smith and P. R. 
Halmos. Pacific J. Math., 16 (1966): 421-431. 


BIBLIOGRAPHY 283 


Brodskii, M. S. On a problem of I. M. Gelfand. Uspekhi Mat. Nauk (N.S.), 12 (1957): 129-132, 


Davidson, K. R. Nest Algebras. Pitman Research Notes in Math, 191. Longman Scientific and Technical, 
Essex, England, 1988. 


Donoghue, W. F. The lattice of invariant subspaces of a completely continuous quasinilpotent transforma- 
tions. Pacific J. Math., 7 (1957): 1031-1935. 


Enflo, P. On the invariant subspace problem for Banach spaces. Acta Math., 158 (1987): 213-313. 


Lomonosov, V. I. Invariant subspaces for the family of operators which commute with a completely
continuous operator. Funct. Anal. Appl., 7 (1973): 213-214.


Radjavi, H. and Rosenthal, P. Invariant Subspaces. Springer, New York, 1973.
Ringrose, J. R. Compact Non-self-adjoint Operators. Van Nostrand Reinhold, New York, 1971. 


Ringrose, J. R. Superdiagonal forms for compact linear operators. Proc. London Math. Soc. (3), 12 (1962): 
367-384. 


26


HARMONIC ANALYSIS ON A HALFLINE


In this chapter we refine the technique described in chapter 21 for the study of com- 
pact operators, and use it to deduce the exponential decay of a class of functions. 


26.1 THE PHRAGMEN-LINDELOF PRINCIPLE 
FOR HARMONIC FUNCTIONS 


The maximum principle of the theory of analytic functions has a classical general- 
ization due to Phragmén and Lindelöf for analytic functions defined on domains that 
stretch to infinity. The assumptions are that the function is bounded on the boundary 
of the domain, and that it is of limited growth as z approaches infinity. The con- 
clusion is that the function is bounded in the whole domain. This principle has an 
analogue for harmonic functions which we now state and prove: 

Let h(x, y) be a harmonic function defined in the half-strip $-1 \le x \le 1$, $0 \le y$,
and continuous up to the boundary. We assume that


(i) $h(\pm 1, y) = 0$ for $y \ge 0$.


(ii) $|h(x, y)| \le \text{const.}\, e^{\ell y}$ in the half-strip, where $\ell$ is some positive number less
than $\pi/2$.


We claim that then $|h(x, y)| \le \text{const.}\, e^{-my}$ in the half-strip, where m is any number
less than $\pi/2$.


Proof. We want to compare h to the auxiliary harmonic function

$k_\epsilon(x, y) = A \cos mx\, e^{-my} + \epsilon \cos mx\, e^{my}$, (1)

where m is some positive number,

$\ell < m < \pi/2$. (1')




Because of the upper bound placed on m, the function cos mx is positive on $-1 \le x \le 1$;
choose A in (1) so large that for all x in [-1, 1],


$|h(x, 0)| \le A \cos mx$. (2)


We claim that $|h(x, y)| \le k_\epsilon(x, y)$ for all x, y in the half-strip, for any positive
choice of $\epsilon$. To see this, we note that


$-k_\epsilon(x, y) \le h(x, y) \le k_\epsilon(x, y)$ (3)


for y = 0, x in [-1, 1], since A has been so chosen. Furthermore (3) holds for
y > 0, x = ±1, since by hypothesis (i) h is zero there, while $k_\epsilon$ is positive. Finally
we note that (3) holds for y = Y large enough, $|x| \le 1$, since by hypothesis (ii)
h grows at most as $e^{\ell y}$, while $k_\epsilon$ grows as $e^{my}$, and $m > \ell$. Applying the classical
maximum principle to the rectangle $0 \le y \le Y$, $-1 \le x \le 1$, we conclude that
(3) holds inside the rectangle as well. Since Y is arbitrary, (3) holds in the entire
half-strip for any positive value of $\epsilon$. Letting $\epsilon$ tend to zero, we obtain that


$|h(x, y)| \le A e^{-my}$


in the entire half-strip. This proves that h decays exponentially as y tends to ∞. □
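
As a check on the comparison function used in the proof above, the following computation verifies that each term of (1) is harmonic and records the lower bound that makes the function positive on the half-strip. It is a routine verification added for convenience, not part of the original argument; it assumes A > 0 and $0 < m < \pi/2$ as in (1').

```latex
\begin{align*}
\Delta\bigl(\cos mx\, e^{\pm my}\bigr)
  &= \partial_x^2\bigl(\cos mx\bigr)\, e^{\pm my} + \cos mx\, \partial_y^2\bigl(e^{\pm my}\bigr) \\
  &= -m^2 \cos mx\, e^{\pm my} + m^2 \cos mx\, e^{\pm my} = 0, \\
k_\epsilon(x, y) &\ge \epsilon \cos mx\, e^{my} \ge \epsilon \cos m \; e^{my} > 0
  \quad\text{for } |x| \le 1,\ y \ge 0 .
\end{align*}
```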


26.2 AN ABSTRACT PHRAGMEN-LINDELOF PRINCIPLE


This section is devoted to the statement and proof of a sweeping generalization, due to
the author, of the Phragmén-Lindelöf theorem stated above. It is based on a functional
analytic abstraction of the space of harmonic functions.


Definition. Let X be a Banach space over C, S a linear space of locally integrable 
functions u(y) on the positive reals y > 0 whose values belong to the Banach space 
X. We impose the following two conditions on the space S: 


(i) S is translation invariant, that is, if u(y) belongs to S, so does u(y + t), t any 
positive number. 


(ii) Define the norms $|u|_a^b$ as


$|u|_a^b = \int_a^b |u(y)|\,dy$, (4)


where $0 \le a < b$. We require that the unit ball in S in the $|u|_a^b$ norm be
precompact in the $|u|_{a_0}^{b_0}$-norm whenever $[a_0, b_0]$ is a compact subinterval of
(a, b), that is, $a < a_0 < b_0 < b$.




A similar definition can be given in terms of the $L^p$-norm


$|u|_a^b = \left(\int_a^b |u(y)|^p\,dy\right)^{1/p}$ (4')


in place of the $L^1$-norm.


Property (ii) is called interior compactness.


If we adjoin to S all limits of sequences of functions in S that converge in the $L^1$
sense on compact subsets of $\mathbb{R}_+$, the extended space retains both properties (i) and
(ii). Therefore we may assume without loss of generality that (iii) S is closed in the
sequential topology described above.


Example 1. Take X to be the space of continuous functions on [-1, 1] that vanish
at the endpoints. Take S to consist of all functions u(y) = h(x, y), where h is any
harmonic function in the half-strip $x \in [-1, 1]$, $y \ge 0$, continuous up to the boundary
and equal to zero for x = ±1. It is not hard to show that the space S defined in
example 1 has properties (i) and (ii).


Example 2. Let L be a linear elliptic operator of order 2n, whose coefficients are 
independent of one of the variables, call it y. Let G be a domain in the space of the 
remaining variables, call them x, whose closure is compact and whose boundary is 
smooth. Take the Banach space X to be the space $H_0^n$ of functions in G with square
integrable derivatives of order n, whose derivatives of order i = 0, 1, ..., n - 1 are
zero on the boundary of G. Take S to consist of all functions of the form u(y) =
h(x, y), where h(x, y) is any solution of Lh = 0 in the half-cylinder $G \times \mathbb{R}_+$ whose
values for fixed y belong to X.


Using the theory of elliptic equations we can show that the space S defined above is
interior compact.


Example 3. X = C, S the space spanned on $\mathbb{R}_+$ by a set of exponential functions
$\{e^{-\mu_n y}\}$, where the $\mu_n$ are real, positive, and $\sum \mu_n^{-1} < \infty$. According to Müntz’s
theorem (see chapter 9) S is a proper subspace of the space of continuous functions
on $\mathbb{R}_+$ that vanish at ∞. These spaces were studied by Laurent Schwartz; the interior
compactness of such a space follows from properties established by Schwartz.


We state now our abstract Phragmén-Lindelöf principle.

Theorem 1. Let S be a translation invariant, interior compact space of vector-valued
functions on $\mathbb{R}_+$. Then there exists a positive number c, depending only on
the space S, such that every function u in S that satisfies


$\int_0^\infty |u(y)|\,dy < \infty$


also satisfies

$\int_0^\infty |u(y)|\,e^{cy}\,dy < \infty$.
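
To see what theorem 1 asserts in a concrete case, consider the setting of example 1. The computation below is an illustration under that assumption; the separated solutions $h_k$ are introduced here for the purpose and do not appear in the text.

```latex
\begin{align*}
h_k(x, y) &= \sin\!\Bigl(\tfrac{k\pi}{2}(x + 1)\Bigr)\, e^{-k\pi y/2}, \qquad k = 1, 2, \dots, \\
\Delta h_k &= \Bigl(-\tfrac{k^2\pi^2}{4} + \tfrac{k^2\pi^2}{4}\Bigr) h_k = 0, \qquad h_k(\pm 1, y) = 0, \\
\int_0^\infty \max_{|x| \le 1} |h_k(x, y)|\, e^{cy}\, dy
  &= \int_0^\infty e^{(c - k\pi/2)\, y}\, dy < \infty
  \iff c < \tfrac{k\pi}{2}.
\end{align*}
```

Taking k = 1 shows that in this example c cannot be chosen as large as $\pi/2$, in agreement with the rate obtained in section 26.1.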


Proof. Denote by K the subspace of S consisting of integrable functions u(y); K
is a Banach space under the $L^1$-norm


$|u| = \int_0^\infty |u(y)|\,dy$. (5)


Denote by T(t) the translation operator acting on K : 


(T(t)u)(y) = u(y +t). (6) 


Clearly, the operators T(t) are contractions, that is, of norm $\le 1$. Therefore the
spectrum of each T(t) lies in the closed unit disk. The key to theorem 1 is to show that the
spectrum of T(t), t > 0, lies inside the unit disk.
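
The contraction property can be read off from the definition (6). The short computation below spells it out, together with two consequences that are used repeatedly later; it is a routine verification added for convenience.

```latex
\begin{align*}
|T(t)u| &= \int_0^\infty |u(y + t)|\, dy = \int_t^\infty |u(y)|\, dy
  \le \int_0^\infty |u(y)|\, dy = |u|, \\
T(s)\,T(t) &= T(s + t), \\
|T(t)^m u| &= \int_{mt}^\infty |u(y)|\, dy \longrightarrow 0
  \quad (m \to \infty) \text{ for each fixed } u \in K .
\end{align*}
```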

The proof is somewhat elaborate and is based on 12 lemmas and 2 propositions. 
In what follows T denotes any of the operators T(t), t > 0.


Proposition A. Every nonzero boundary point $\lambda$ of the spectrum of T that lies inside
the unit disk, $|\lambda| < 1$ and $\lambda \ne 0$, is an eigenvalue of finite multiplicity. That is,


(i) $(T - \lambda)$ has a nontrivial nullspace of finite multiplicity.

(ii) The space of all generalized eigenfunctions is finite dimensional.

(iii) Denote by N the space of all generalized eigenfunctions. The mapping $T - \lambda$
of K/N into K/N is invertible.

(iv) $\lambda$ is an isolated point of the spectrum of T over K.

Proof. We need a series of lemmas:


Lemma 1. Let $\lambda$ be a complex number inside the unit disc, $|\lambda| < 1$, and $\lambda \ne 0$. Let
$\{u_n\}$ be a bounded sequence of functions in K such that for some positive integer k,


$\lim_{n \to \infty} (T - \lambda)^k u_n = v$. (7)


Then a subsequence of $u_n$ converges strongly in K, and the limit u satisfies
$(T - \lambda)^k u = v$.


Proof. It suffices to prove the result for k = 1. We have assumed that the sequence
$\{u_n\}$ is bounded: $|u_n| \le M$. Therefore


$|u_n|_0^{3t} \le |u_n| \le M$.


Since the functions $u_n$ lie in an interior compact space, there is a subsequence, again
denoted as $\{u_n\}$, that converges in the $|u|_t^{2t}$-norm:


$|u_n - u_m|_t^{2t} \to 0$. (8)


We claim that the sequence $\{u_n\}$ converges in the $|u|_0^a$-norm for any a. For according
to (7) with k = 1, $Tu_n - \lambda u_n$ converges in the $L^1$-norm. Since $(Tu_n)(y) = u_n(y + t)$
and since $\lambda \ne 0$, it follows from this and (8) that $\{u_n\}$ converges in the $|\cdot|_0^{2t}$-norm;
it further follows that $\{u_n\}$ converges in the $|\cdot|_0^{3t}$-norm. Repeating this argument N
times we conclude that $\{u_n\}$ converges in the $|\cdot|_0^{Nt}$-norm, as asserted.
To complete the proof of lemma 1, we show that the functions $u_n$ have uniformly
small $L^1$ norms near ∞. We take the algebraic identity


$T^m = \lambda^m + \sum_{j=1}^m \lambda^{m-j} T^{j-1}(T - \lambda)$

and apply it to $u_n$. Denoting $(T - \lambda)u_n = v_n$, we get


$T^m u_n = \lambda^m u_n + \sum_{j=1}^m \lambda^{m-j} T^{j-1} v_n$. (9)

By hypothesis (7), $v_n = v + \epsilon_n$, where $|\epsilon_n| \to 0$. We rewrite the identity above as

$T^m u_n = \lambda^m u_n + \sum_{j=1}^m \lambda^{m-j} T^{j-1} v + \sum_{j=1}^m \lambda^{m-j} T^{j-1} \epsilon_n$. (9')


We claim that as m and n tend to ∞, the right side of (9') tends to zero in the norm
of K. The last group of terms can be estimated as


$\left|\sum_{j=1}^m \lambda^{m-j} T^{j-1} \epsilon_n\right| \le \sum_{j=1}^m |\lambda|^{m-j} |\epsilon_n| \le \frac{|\epsilon_n|}{1 - |\lambda|}$.


To estimate the next group of terms, we note that $|T^{j-1} v| = d_j$ tends to zero as j
tends to ∞; therefore


$\left|\sum_{j=1}^m \lambda^{m-j} T^{j-1} v\right| \le \sum_{j=1}^m |\lambda|^{m-j} d_j$


tends to zero as m tends to ∞. Finally $|\lambda^m u_n| \le |\lambda|^m M$ tends to zero as m tends
to ∞. So we conclude from (9') that given any positive $\epsilon$,


$|T^m u_n| < \epsilon$


for all n and m large enough. Since


$|T^m u_n| = \int_{mt}^\infty |u_n(y)|\,dy$,


this proves that the functions $u_n$ are uniformly small at ∞. Hence the subsequence
$\{u_n\}$ converges in the $L^1$-norm not only over finite subintervals but on the whole
positive axis. □
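
The algebraic identity used in the proof of lemma 1 is a telescoping sum. The sketch below verifies it and records elementary estimates behind the bounds on the two sums; splitting the middle sum at $m/2$ is one possible way to argue, supplied here as an assumption about the intended reasoning. It uses $d_j \le |v|$, which holds because T is a contraction.

```latex
\begin{align*}
\sum_{j=1}^{m} \lambda^{m-j}\, T^{j-1}(T - \lambda)
  &= \sum_{j=1}^{m} \bigl(\lambda^{m-j} T^{j} - \lambda^{m-(j-1)} T^{j-1}\bigr)
   = T^{m} - \lambda^{m}, \\
\sum_{j=1}^{m} |\lambda|^{m-j} &\le \frac{1}{1 - |\lambda|} \qquad (|\lambda| < 1), \\
\sum_{j=1}^{m} |\lambda|^{m-j} d_j
  &\le \frac{|\lambda|^{m/2}}{1 - |\lambda|}\, \sup_j d_j
   + \frac{1}{1 - |\lambda|}\, \sup_{j > m/2} d_j \longrightarrow 0
   \quad (m \to \infty).
\end{align*}
```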


Lemma 2. Let T be any bounded map of a Banach space K over C into K, and let
$\lambda$ be a boundary point of the spectrum of T.


(i) There exists a sequence of unit vectors $u_n$ such that

$|(T - \lambda)u_n| \to 0$. (10)

(ii) The image of K under $T - \lambda$ is a proper subspace of K.


Lemma 2 is a restatement of theorem 2 of chapter 20. □


We turn now to proving part (i) of proposition A. Let $\lambda$ be a boundary point of
the spectrum of T, $|\lambda| < 1$ and $\lambda \ne 0$. According to part (i) of lemma 2, there is a
sequence of functions $\{u_n\}$ in K, $|u_n| = 1$, that satisfies (10). According to lemma 1,
a subsequence converges to a limit u that satisfies $(T - \lambda)u = 0$. Since a limit of
unit vectors is a unit vector, u is an eigenvector; this shows that $\lambda$ is an eigenvalue.
Denote by $N_1$ the space of eigenvectors with eigenvalue $\lambda$; we claim that it is finite
dimensional. To see this, we note that $(T - \lambda)u = 0$ means that


$u(y + t) = \lambda u(y)$.


It follows that for an eigenfunction u,


$|u|_0^{3t} = \int_0^{3t} |u(y)|\,dy = \left(|\lambda|^{-1} + 1 + |\lambda|\right) \int_t^{2t} |u(y)|\,dy = \left(|\lambda|^{-1} + 1 + |\lambda|\right) |u|_t^{2t}$.


This shows that for eigenvectors the $|\cdot|_0^{3t}$ norm and the $|\cdot|_t^{2t}$ norm are equivalent.
But the unit sphere in the former norm is by interior compactness precompact in the
latter norm; that makes the unit ball of $N_1$ in the $|\cdot|_t^{2t}$-norm precompact. But then
according to theorem 6 of chapter 5, $N_1$ is finite dimensional. This completes the
proof of part (i) of proposition A.

We turn now to part (ii) concerning generalized eigenfunctions, defined as the
nullspace $N_k$ of the operator $(T - \lambda)^k$, k = 1, 2, .... As already noted in chapter 2,
theorem 2, if $N_1$ is finite dimensional, then all the nullspaces $N_k$ are finite dimensional,
and furthermore their dimensions satisfy the inequality


$\dim N_k - \dim N_{k-1} \ge \dim N_{k+1} - \dim N_k$. (11)


This inequality derives from the fact that $(T - \lambda)$ maps $N_{k+1}/N_k$ one-to-one into
$N_k/N_{k-1}$.
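
A finite-dimensional illustration of inequality (11), not taken from the text: suppose $J = T - \lambda$ acts on $\mathbb{C}^4$ as a nilpotent matrix with Jordan blocks of sizes 3 and 1 (a hypothetical example chosen for concreteness), and write $N_k$ for the nullspace of $J^k$, with $N_0 = \{0\}$.

```latex
\begin{align*}
&\dim N_1 = 2, \quad \dim N_2 = 3, \quad \dim N_3 = 4, \quad \dim N_4 = 4, \\
&\dim N_1 - \dim N_0 = 2 \;\ge\; \dim N_2 - \dim N_1 = 1
  \;\ge\; \dim N_3 - \dim N_2 = 1 \;\ge\; \dim N_4 - \dim N_3 = 0 .
\end{align*}
```

In this example the increment $\dim N_k - \dim N_{k-1}$ counts the Jordan blocks of size at least k, which is why the sequence of increments cannot increase.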




Lemma 3. There is an index i such that the sign of equality holds in (11) for $k \ge i$:


$\dim N_k/N_{k-1} = \dim N_{k+1}/N_k$, $k \ge i$. (11')

Proof. A nonincreasing sequence of nonnegative integers such as $\dim N_{k+1}/N_k$
eventually consists of the same numbers. □


REMARK. It follows from (11') that for $k \ge i$, $T - \lambda$ maps $N_{k+1}/N_k$ one-to-one
onto $N_k/N_{k-1}$.


Lemma 4. With $N_k$ defined as above, denote their union by N: $N = \bigcup_k N_k$.


(i) $T - \lambda$ maps $N/N_i$ one-to-one onto $N/N_{i-1}$.


(ii) $(T - \lambda)^{-1}$ is a bounded map of $N/N_{i-1}$ onto $N/N_i$.

Proof. Part (i) is purely linear algebra:
Exercise 1. Prove part (i) of lemma 4.

(ii) If $(T - \lambda)^{-1}$ were unbounded, there would be a sequence $U_n$ in $N/N_i$,
$|U_n| = 1$, such that $(T - \lambda)U_n = V_n$ is a nullsequence in $N/N_{i-1}$. Since $|U_n| = 1$,
there is an element $u_n$ in the coset $U_n$ satisfying $|u_n| < 2$. On the other hand
$|V_n| \to 0$ means that we can write


$(T - \lambda)u_n = v_n + z_n$, (12)


where $z_n \in N_{i-1}$, $|v_n| \to 0$. Applying $(T - \lambda)^{i-1}$ to (12), we get


$(T - \lambda)^i u_n = (T - \lambda)^{i-1} v_n$. (12')


Since $|v_n| \to 0$ and $(T - \lambda)^{i-1}$ is a bounded map, the norm of the right side of (12')
tends to zero; therefore so does the left side:


$\lim_{n \to \infty} |(T - \lambda)^i u_n| = 0$. (13)


We appeal now to lemma 1: since $|u_n| < 2$ and (13) is relation (7) with v = 0, it
follows that a subsequence of $u_n$ converges to a limit u that satisfies


$(T - \lambda)^i u = 0$.


Such a u belongs to $N_i$. Recalling the definition of norm in a coset mod $N_i$, we
conclude that


$|U_n| \le |u_n - u| \to 0$


for a subsequence. But this contradicts our assumption that $|U_n| = 1$ for all n; therefore
assuming that $(T - \lambda)^{-1}$ is unbounded leads to a contradiction. □


Denote the closure of N in K by $\bar N$. Since $N_{i-1}$ and $N_i$ are finite dimensional
and thereby closed, $\bar N/N_{i-1}$ and $\bar N/N_i$ are the completions of $N/N_{i-1}$ and $N/N_i$,
respectively. Since $(T - \lambda)^{-1}$ is bounded, it can be extended as a bounded mapping
of $\bar N/N_{i-1}$ to $\bar N/N_i$. This shows that $(T - \lambda)$ maps $\bar N/N_i$ one-to-one onto $\bar N/N_{i-1}$.


Lemma 5. $\lambda$ is not in the spectrum of T as a map of $K/\bar N$ into $K/\bar N$.


Proof. We argue indirectly and assume the contrary. We have assumed that $\lambda$ is
a boundary point of the spectrum of T over K. That means that there is a sequence
of complex numbers $\mu_n \to \lambda$ that belong to the resolvent set of T over K. Suppose
that infinitely many of these belong to the resolvent set of T over $K/\bar N$; then if $\lambda$
belonged to the spectrum of T over $K/\bar N$, there would be, according to part (i) of
lemma 2, a sequence $\{U_n\}$ of cosets mod $\bar N$ such that $|U_n| = 1$, $|(T - \lambda)U_n| \to 0$.
Let $u_n$ be an element of the coset $U_n$, $|u_n| < 2$; then


$(T - \lambda)u_n = v_n + z_n$, (14)


where $z_n$ belongs to $\bar N$ and $|v_n| \to 0$. It follows from (14) that the $z_n$ are uniformly
bounded: $|z_n| \le (|T| + |\lambda|)|u_n| + |v_n| < 5$ for n large enough. As observed
above, $(T - \lambda)^{-1}$ maps $\bar N/N_{i-1}$ boundedly onto $\bar N/N_i$; this means that there is an
element $w_n$ in $\bar N$ and a constant c such that $|w_n| \le c|z_n| < 5c$, and


$(T - \lambda)w_n = z_n \pmod{N_{i-1}}$.

Subtracting this from (14) yields

$(T - \lambda)(u_n - w_n) = v_n \pmod{N_{i-1}}$.

Applying $(T - \lambda)^{i-1}$ to both sides, we have

$(T - \lambda)^i (u_n - w_n) = (T - \lambda)^{i-1} v_n$. (14')


Since $(T - \lambda)^{i-1}$ is a bounded operator, the norm of the right side of (14') tends to
zero; therefore so does the left side:


$\lim_{n \to \infty} |(T - \lambda)^i (u_n - w_n)| = 0$. (15)


Since $|u_n - w_n| \le |u_n| + |w_n|$ is uniformly bounded, we can appeal to lemma 1 and
conclude that a subsequence of $u_n - w_n$ converges to a limit u that satisfies


$(T - \lambda)^i u = 0$.


Such a u belongs to $N_i$; according to the definition of the norm of cosets


$|U_n| \le |u_n - w_n - u| \to 0$,


in contradiction to $|U_n| = 1$ for all n. Therefore it must be the case that all but a
finite number of the $\mu_n$ belong to the spectrum of T over $K/\bar N$. Since $\mu_n$ belongs
to the resolvent set of T over K, $T - \mu_n$ maps K onto K, and therefore $K/\bar N$ onto
$K/\bar N$. So the only way $\mu_n$ can belong to the spectrum of T over $K/\bar N$ is for $\mu_n$ to be
an eigenvalue. That is, there exists a coset $U_n$, $|U_n| = 1$, such that $(T - \mu_n)U_n = 0$.


Since $\mu_n \to \lambda$, it follows that $|(T - \lambda)U_n| \to 0$; so we are back where we were
before, heading into a contradiction. We got into this contradiction by supposing,
erroneously, that $\lambda$ belongs to the spectrum of T over $K/\bar N$. □


Lemma 6. $\lambda$ belongs to the resolvent set of T over $K/N_{i-1}$.


Proof. Lemma 5 shows that $T - \lambda$ maps K onto $K/\bar N$; previously we have deduced
from lemma 4 that $T - \lambda$ maps $\bar N$ onto $\bar N/N_{i-1}$. Combining these statements, we
conclude that $T - \lambda$ maps K onto $K/N_{i-1}$, and therefore that $T - \lambda$ maps $K/N_{i-1}$
onto itself.

We have assumed that $\lambda$ is a boundary point of the spectrum of T over K and
therefore that $\lambda$ is a limit of a sequence of points $\mu_n \ne \lambda$ in the resolvent set of T
over K. Since $N_{i-1}$ is finite dimensional, it follows from exercise 1 of chapter 25 that
$\mu_n$ belongs to the resolvent set of T over $K/N_{i-1}$ as well; therefore if $\lambda$ belonged
to the spectrum of T over $K/N_{i-1}$, it would be a boundary point of the spectrum.
According to part (ii) of lemma 2 the image of $K/N_{i-1}$ under $T - \lambda$ would be a proper
subspace of $K/N_{i-1}$, but this contradicts the fact established above that $T - \lambda$ maps
$K/N_{i-1}$ onto itself. □


Since $\lambda$ is in the resolvent set of T over $K/N_{i-1}$, it follows that $\lambda$ is not an
eigenvalue, that is, that $T - \lambda$ maps no element of K not in $N_{i-1}$ into $N_{i-1}$. This shows that
$N_i = N_{i-1}$; thus the space of generalized eigenvectors $N_i$ is of finite dimension.
This proves parts (ii) and (iii) of proposition A.

Part (iv), that $\lambda$ is an isolated point of the spectrum of T over K, now follows
as in the classical Riesz theory. Since the resolvent set of T over $K/N_{i-1}$ is open,
all complex numbers $\mu$ sufficiently close to $\lambda$ are in the resolvent set of T over
$K/N_{i-1}$. On the other hand, the spectrum of T over $N_{i-1}$ consists of the single
point $\lambda$. Since the resolvent set of T over K is the intersection of its resolvent sets
over $N_i$ and $K/N_i$, $\lambda$ is an isolated point of the spectrum. This completes the proof
of proposition A. □


Proposition B. The spectrum of T is a discrete set of points accumulating only at 
the origin. 


The proof rests on four lemmas. 


Lemma 7. Let $\lambda$ be a complex number in the spectrum of T over K with absolute
value 1: $|\lambda| = 1$. Then there is an eigenfunction v in S satisfying


$v(y + t) = \lambda v(y)$. (16)


The eigenfunction v lies in S, not in K.




Proof. Since the spectrum of T over K lies in the closed unit disk, every point $\lambda$ of
absolute value 1 in the spectrum of T is a boundary point of the spectrum. Therefore
according to part (i) of lemma 2, given any nullsequence $\{\epsilon_n\}$ of positive numbers,
say $\epsilon_n = 1/n^2$, there is a sequence $\{u_n\}$ of functions in K for which the quotient


$\frac{|(T - \lambda)u_n|}{|u_n|} < \epsilon_n$. (17)

For any positive number a we can rewrite (17) as


$\frac{\sum_k |(T - \lambda)u_n|_{ka}^{(k+1)a}}{\sum_k |u_n|_{ka}^{(k+1)a}} < \epsilon_n$.


It follows that for each n there is an integer $k_n$ such that


$\frac{|(T - \lambda)u_n|_{k_n a}^{(k_n+1)a}}{|u_n|_{k_n a}^{(k_n+1)a}} \le \epsilon_n$. (17')

We define $v_n$ as


$v_n(y) = c_n u_n(y + k_n a)$,

$c_n$ a constant so chosen that

$|v_n|_0^t = 1$, (18)

where t is the amount of translation by T(t). (17') can be rewritten as

$|(T - \lambda)v_n|_0^a \le \epsilon_n |v_n|_0^a$. (17'')

We choose now a = nt, and denote


$A_n = |v_n|_0^{nt}$.


We claim that for all integers $k \le n$,


$|v_n|_0^{kt} \le k + (k - 1)\epsilon_n A_n$. (18')


For k = 1 this is the normalization (18); for k > 1 it follows inductively from (17'').
So we have for k = n that


$A_n = |v_n|_0^{nt} \le n + (n - 1)\epsilon_n A_n$.


Since we have chosen $\epsilon_n = 1/n^2$, it follows that $A_n < 2n$. Setting this into (17''),
we get


$|(T - \lambda)v_n|_0^{nt} \le \frac{2}{n}$. (18'')


Now we proceed as in the proof of the first part of lemma 1 and use (17'') and (18'')
to select a subsequence of $\{v_n\}$ that converges to a limit v in the norm $|\cdot|_0^b$, b any
positive number. Since S is closed in this topology, v belongs to S and satisfies (16).

It follows from the normalization (18) that $v \ne 0$. □


Lemma 8. For t small enough, $\lambda = -1$ does not belong to the spectrum of T(t).

Proof. Suppose, on the contrary, that -1 lies in the spectrum of T(t). Then
according to lemma 7 there is $v_t$ in S satisfying


$v_t(y + t) = -v_t(y)$.


As t tends to zero, these functions $v_t$ oscillate more and more rapidly, and clearly
violate interior compactness. Hence $\lambda = -1$ does not belong to the spectrum of
T(t) for t small enough. □


Lemma 9. The spectrum of T(t) in $0 < |\lambda| < 1$ is a discrete set of points accumulating
at most at the origin or on the unit circle.


Proof. We have shown that for t small enough $\lambda = -1$ belongs to the resolvent
set of T(t). Therefore some open set around $\lambda = -1$ belongs to the resolvent set.

According to proposition A, the boundary points of the spectrum of T(t) inside
the unit circle form a discrete set of points that accumulate at most at the origin or
on the unit circle. We claim that all points of the spectrum inside the unit circle
are boundary points. Suppose on the contrary that there were a point that is not a
boundary point. Such a point could be connected to $\lambda = -1$ by a polygonal path that
avoids the discrete set of boundary points, but that is a contradiction, for such a path
must contain a boundary point. This proves lemma 9 for t small enough; for any t it
follows from $T(t) = T^m(t/m)$ and the spectral mapping theorem.

Let p be any positive number; denote by $S^{(p)}$ the set of functions $u^{(p)}$ of the form


$u^{(p)}(y) = e^{-py} u(y)$, u in S. (19)


Clearly, $S^{(p)}$ is translation invariant, and it is interior compact, since for a and b
finite,


$e^{-pb} |u|_a^b \le |u^{(p)}|_a^b \le e^{-pa} |u|_a^b$. (19')

We define the space $K^{(p)}$ as the subspace of $S^{(p)}$ consisting of integrable functions
$u^{(p)}$; $K^{(p)}$ is a Banach space under the $L^1$-norm.
From here on T denotes the unit translation T(1).


Lemma 10. If $\lambda$ belongs to the spectrum of T acting on K, and p is a positive
number, then $\lambda e^{-p}$ belongs to the spectrum of T acting on $K^{(p)}$.


Proof. Recall that every $\lambda$ in the spectrum of T over K is an eigenvalue in the
sense that there is an eigenfunction u satisfying


$u(y + 1) = \lambda u(y)$.


For $|\lambda| < 1$ the eigenfunction u belongs to K, and for $|\lambda| = 1$, u belongs to S.
Clearly, $u^{(p)} = e^{-py} u$ belongs to $K^{(p)}$, and satisfies


$u^{(p)}(y + 1) = e^{-p(y+1)} u(y + 1) = \lambda e^{-p} u^{(p)}(y)$.


This shows that $u^{(p)}$ is an eigenfunction of T over $K^{(p)}$ with eigenvalue $\lambda e^{-p}$. □


According to lemma 9 applied to the space $K^{(p)}$, the spectrum of T acting on
$K^{(p)}$ has no points of accumulation inside the unit circle other than zero. Combining
this with lemma 10, we conclude that the spectrum of T acting on K has no point of
accumulation on the unit circle, as asserted in Proposition B. □


We are now ready to prove theorem 1. We only need a couple of lemmas, the first 
an extension of lemma 10: 


Lemma 11'. Denote by $\Sigma^{(p)}$ the spectrum of T acting on the space $K^{(p)}$. Let p and
q be two positive numbers, p < q. Then


$e^{p-q}\,\Sigma^{(p)} \subset \Sigma^{(q)}$. (20)


The proof is the same as that of lemma 10 after we observe that $S^{(q)} = e^{-(q-p)y} S^{(p)}$.

According to proposition B, $\Sigma^{(q)}$ is a discrete pointset in the unit disk accumulating
only at the origin; it follows from (20) that $\Sigma^{(p)}$ contains a point on the unit
circle for only a discrete set of values of p, p < q. Choose p to be different from
this discrete set of values, and arrange the eigenvalues $\lambda_j$ of T acting on $K^{(p)}$ in
decreasing order in absolute value:


$1 > |\lambda_1| \ge |\lambda_2| \ge \dots \to 0$.


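Define the operators $P_j$ by

$P_j = \frac{1}{2\pi i} \oint_{C_j} (\zeta - T)^{-1}\,d\zeta$,

where $C_j$ is a small circle around $\lambda_j$ that contains no other point of the spectrum.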


Lemma 12.
(i) The operators $P_j$ are pairwise disjoint projections:


$P_j^2 = P_j$, $P_j P_k = 0$ for $j \ne k$.


(ii) Each $P_j$ commutes with T.


(iii) The range of $P_j$ is the space of generalized eigenfunctions of T with eigenvalue
$\lambda_j$.


Exercise 2. Prove lemma 12.
Note that according to proposition A the range of each $P_j$ is finite dimensional.
Denote by $K_m^{(p)}$ the intersection of the nullspaces of $P_1, \dots, P_{m-1}$; $K_m^{(p)}$ is
invariant under T. Denote by $T_{p,m}$ the operator T acting on $K_m^{(p)}$. The spectrum of
$T_{p,m}$ consists of the eigenvalues $\lambda_m, \lambda_{m+1}, \dots$. According to the formula for
spectral radius (see theorem 4 of chapter 17),


$\lim_{k \to \infty} |T_{p,m}^k|^{1/k} = |\lambda_m|$.


It follows that given any positive $\epsilon$, for k large enough

$|T_{p,m}^k| \le (|\lambda_m| + \epsilon)^k$. (21)


Every $u^{(p)}$ in $K^{(p)}$ can be decomposed as


$u^{(p)} = \sum_{j=1}^{m-1} f_j + v$, (22)

where $f_j = P_j u^{(p)}$, and v belongs to $K_m^{(p)}$. For any u in K, $e^{-py} u = u^{(p)}$ belongs to
$K^{(p)}$; using the decomposition (22), we can write


$u = \sum_{j=1}^{m-1} e^{py} f_j + e^{py} v$. (22')


Now choose the integer m so large that $|\lambda_m| < \tfrac14 e^{-p}$, and choose $\epsilon < \tfrac14 e^{-p}$. It
follows from inequality (21) that for k large,


$\int_k^{k+1} e^{py} |v(y)|\,dy \le e^{p(k+1)} \int_k^{k+1} |v(y)|\,dy \le e^{p(k+1)} |T_{p,m}^k v| \le e^{p(k+1)} \left(\tfrac12 e^{-p}\right)^k |v| = e^p \left(\tfrac12\right)^k |v|$. (23)


This shows that the last term on the right in (22') is integrable. We claim that the
rest of the terms $e^{py} f_j$ decay exponentially, for they are eigenfunctions or generalized
eigenfunctions of T with eigenvalues $\lambda_j e^p$. All of these eigenvalues must be
less than 1 in absolute value, for otherwise they and their sum would not be integrable. This would
contradict the fact that u belongs to the space K.

Clearly, we can choose c positive and so small that the conditions


$e^c |\lambda_j| e^p < 1$, $\tfrac12 e^c < 1$,


are fulfilled for all $\lambda_j$ that satisfy $|\lambda_j| e^p < 1$. It follows then from (22') and (23) that
$e^{cy} u(y)$ is integrable. This completes the proof of theorem 1. □


26.3 ASYMPTOTIC EXPANSION 


We show now how to use the results derived for the proof of theorem 1 to give an 
asymptotic description of functions contained in the space K. 


Lemma 13.


(i) There is a basis for the eigenfunctions of T(1) over K consisting of exponential
functions, that is, functions of the form


$e^{\mu y} w$, $\operatorname{Re} \mu < 0$. (24)


(ii) There is a basis of the generalized eigenfunctions of T(1) over K consisting
of exponential polynomials, that is, a sum of functions of the form


$y^k e^{\mu y} w_k$. (25)


Proof. The translation operators T(t) commute. It follows that an eigenspace of
T(1) is an invariant subspace of all the translations T(t). It follows from this that
there is a basis for the eigenfunctions of T(1) consisting of eigenfunctions v for all
the translations:


$T(t)v = \lambda(t)v$. (26)


Since v belongs to K, it is an integrable function of y. Therefore T(t)v is a continuous
function of t in the $L^1$-norm. This shows that $\lambda(t)$ is continuous. Since translations
satisfy


$T(s + t) = T(s)T(t)$,

it follows from (26) that

$\lambda(s + t) = \lambda(s)\lambda(t)$.


The only continuous solutions of this equation are $\lambda(t) = e^{\mu t}$. Since the operator
T(t) is norm decreasing, it follows that $\operatorname{Re} \mu < 0$. This proves part (i) of lemma 13;
part (ii) can be argued similarly. □
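
The relations (26) and $\lambda(s + t) = \lambda(s)\lambda(t)$ can be checked numerically for a scalar exponential. The sketch below is only an illustration under the assumption X = C; the helper names `translate`, `exponential`, and `eigenvalue` and the sample rate `mu` are hypothetical choices made here, not notation from the text.

```python
import cmath


def translate(u, t):
    """Return the translated function y -> u(y + t), as in definition (6)."""
    return lambda y: u(y + t)


def exponential(mu):
    """Scalar exponential y -> exp(mu * y); mu is a complex rate."""
    return lambda y: cmath.exp(mu * y)


def eigenvalue(mu, t):
    """lambda(t) = exp(mu * t), the factor picked up under translation by t."""
    return cmath.exp(mu * t)


mu = complex(-0.7, 2.0)  # hypothetical rate with Re mu < 0
u = exponential(mu)
s, t = 0.3, 1.1

# T(t) u = lambda(t) u, checked at a few sample points y
for y in (0.0, 0.5, 2.0):
    assert abs(translate(u, t)(y) - eigenvalue(mu, t) * u(y)) < 1e-12

# lambda(s + t) = lambda(s) lambda(t), and |lambda(t)| < 1 when Re mu < 0
assert abs(eigenvalue(mu, s + t) - eigenvalue(mu, s) * eigenvalue(mu, t)) < 1e-12
assert abs(eigenvalue(mu, t)) < 1
```

The check is pointwise at a handful of sample values, so it illustrates the eigenfunction relation rather than proving it.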


Denote by $\{\mu_j\}$ the set of all $\mu$ that appear in an eigenfunction (24) of T(1). It
follows from proposition B that the real parts of $\mu_j$ tend to $-\infty$.


Theorem 2. Let S be as in theorem 1, and K the space of those functions in S that
are integrable. Every function u(y) in K has an asymptotic expansion of the form


$u(y) \sim \sum_j e_j$, (27)


where each $e_j$ is a finite sum of the form


$e_j = \sum_k y^k e^{\mu_j y} w_{j,k}$. (27')

Proof. Decompose u as in equation (22), with p = 0. Use of the formula for the
spectral radius proves the asymptotic character of the expansion (27). □


We now apply theorem 2 to the class, described in example 2, of elliptic operators of order 2n of the form


$L = \sum_{j=0}^{2n} A_j\, \partial_y^{2n-j}$, (28)

where $A_j = A_j(x, \partial_x)$ are linear partial differential operators of order j in the
variables $x_1, \dots, x_m$, with coefficients that may depend on x but not on y. Ellipticity
means that


$L(\xi, \eta) = \sum_j A_j(x, \xi)\, \eta^{2n-j} \ge c\left(|\xi|^{2n} + \eta^{2n}\right)$

for all x, c some positive constant. A typical example is
$L_0 = \partial_y^{2n} + \Delta_x^n$.

Let G be a smoothly bounded domain in x-space, whose closure is compact. We
take S to be the space of all solutions u = u(x, y) of Lu = 0 in the half-cylinder
$G \times \mathbb{R}_+$, which satisfy so-called coercive boundary conditions on $\partial G$. We take these
boundary conditions the same for all y; the simplest example is Dirichlet boundary
conditions. We regard these solutions as functions u(y) whose values belong to the
Banach space of functions of x in G, satisfying the boundary conditions, normed by
the Sobolev norm


$\int_G \sum_{|\alpha| < 2n} |\partial_x^\alpha u|^2\,dx$.


It follows from the theory of elliptic equations that the space S of these functions
u(y) forms an interior compact space.

Consider all solutions of Lu = 0 in the half-cylinder $G \times \mathbb{R}_+$ whose dependence
on y is exponential:


$u(x, y) = e^{\mu y} w(x)$. (29)


Setting this in (28), we conclude that w satisfies the equation


$\left(\sum_j \mu^{2n-j} A_j\right) w = 0$, (30)

and w satisfies the prescribed boundary conditions. Applying theorem 2 to this con- 
crete situation yields the following result. 


Theorem 3.


(i) The set of values $\mu_k$ for which (30) has a nontrivial solution w satisfying the
boundary conditions is a discrete set in the sense that each strip


$a < \operatorname{Re} \mu < b$


contains only a finite number of $\mu_k$.


(ii) Each solution of Lu = 0 that is integrable in the half-cylinder has an asymptotic
expansion of the form


$u \sim \sum_j e^{\mu_j y} w_j(y)$, $\operatorname{Re} \mu_j < 0$,

where $w_j(y)$ is a polynomial in y.
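
As an illustration of theorem 3 in the simplest situation, assume n = 1, $L = \partial_y^2 + \partial_x^2$, and $G = (0, \pi)$ with Dirichlet boundary conditions. This special case is chosen here for concreteness and is not worked out in the text; the coefficients $c_k$ are placeholders for the expansion coefficients of a given solution.

```latex
\begin{align*}
&A_0 = 1, \quad A_1 = 0, \quad A_2 = \partial_x^2, \qquad
  \text{so (30) reads}\quad w'' + \mu^2 w = 0, \quad w(0) = w(\pi) = 0, \\
&w_k(x) = \sin kx, \qquad \mu = \pm k, \quad k = 1, 2, \dots, \\
&u(x, y) \sim \sum_{k \ge 1} c_k\, e^{-k y} \sin kx
  \qquad \text{for integrable solutions, where only } \mu_k = -k \text{ occurs}.
\end{align*}
```

In this case the expansion is the Fourier sine series of u in x, and each strip $a < \operatorname{Re}\mu < b$ contains only finitely many of the values $\pm k$.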
Exercise 3. Take

$L = (\partial_x^2 + \partial_y^2)^2$,

G the interval $[0, \pi]$, and u subject to Dirichlet boundary conditions

$u = u_x = 0$ when x = 0 or $\pi$.


Show that $\mu$ in (30) satisfies the equation $\tan^2 \mu\pi = (\mu\pi)^2 / (1 - (\mu\pi)^2)$.


BIBLIOGRAPHY 


Lax, P. D. A Phragmén-Lindelöf theorem in harmonic analysis and its applications to some questions in
the theory of elliptic equations. CPAM, 10 (1957): 361-389.


Schwartz, L. Etude des sommes d’exponentielles, 2nd ed. Actualités scientifiques et industrielles 959, 
Hermann, Paris, 1959. 


27


INDEX THEORY 


We start by restating the main results of index theory from chapter 2. Let U, V be linear
spaces, in general infinite dimensional. A linear map T : U → V is said to have
finite index if it has these properties:


(i) The nullspace $N_T$ of T is a finite-dimensional subspace of U.
(ii) The quotient space $V/R_T$, $R_T$ the range of T, is finite dimensional.


For such an operator we define the index as


$\operatorname{ind} T = \dim N_T - \operatorname{codim} R_T$. (1)
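
Two standard instances of definition (1) may help fix ideas. They are added here as illustrations; the shift operators $S_+$ and $S_-$ on a space of one-sided sequences are an assumed example, not notation from the text.

```latex
\begin{align*}
&\text{If } \dim U, \dim V < \infty: \quad
  \operatorname{ind} T = \dim N_T - (\dim V - \dim R_T) = \dim U - \dim V
  \quad \text{(rank-nullity)}, \\
&S_+(a_1, a_2, \dots) = (0, a_1, a_2, \dots): \quad
  \dim N_{S_+} = 0, \ \operatorname{codim} R_{S_+} = 1, \ \operatorname{ind} S_+ = -1, \\
&S_-(a_1, a_2, \dots) = (a_2, a_3, \dots): \quad
  \dim N_{S_-} = 1, \ \operatorname{codim} R_{S_-} = 0, \ \operatorname{ind} S_- = 1 .
\end{align*}
```

Since $S_- S_+ = I$ has index 0, these values agree with the additivity of the index under products.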


A map G from one linear space into another is called degenerate if its range is 
finite dimensional. We recall from chapter 2 the following results: 


Theorem A. A linear map T : U → V has finite index iff T has a pseudoinverse,
that is, a linear map S : V → U such that


ST = I + G, TS = I + H, (2)


where I denotes the identity in U and V respectively, and G, H are degenerate maps.


Theorem B. Let T : U → V and R : V → W be linear maps with finite index.
Then their product RT has finite index, and


ind RT = ind R + ind T. (3)


Theorem C. Let T : U → V be a linear map with finite index, and G : U → V a
degenerate linear map. Then T + G has finite index, and


ind (T + G) = ind T. (4)


27.1 THE NOETHER INDEX 


In this chapter we present a corresponding analytic theory dealing with bounded 
linear maps of one Banach space U into another V. As before, T has finite index 
if it has properties (i) and (ii). Condition (ii) implies, according to the closed graph 
theorem, theorem 14 of chapter 15, that the range of T is closed. 

The natural concept of pseudoinverse in this context is the following: 


Definition. Two bounded linear maps T : U → V and S : V → U are called
pseudoinverses of each other if


ST = I + K, TS = I + H, (5)


where K and H are compact maps of U, respectively V, into themselves.


The analogues of theorems A, B, and C hold in the Banach space context, with very 
much the same proofs. There are three additional results with no counterparts in the 
linear algebra context. 

We will use repeatedly the basic result theorem 5 of chapter 21: 


Theorem 0. Let K : U → U be a compact map. Then I + K has finite index, and

ind (I + K) = 0. (6)

The Banach space analogues of theorems A, B, and C are theorems 1, 2, and 3:


Theorem 1. A bounded map T : U → V has finite index iff T has a pseudoinverse
in the sense of definition (5).


Proof. It is easy to verify that the pseudoinverse S constructed in chapter 2 for
theorem A can be chosen to be bounded in case T is a bounded map. Since degenerate
maps are compact par excellence, (2) implies (5). Conversely, if (5) is satisfied, it
follows that the nullspace of T is contained in the nullspace of I + K, and the
range of T contains the range of I + H. Since according to theorem 0, I + K and
I + H have finite indices,


dim N_T ≤ dim N_{I+K} < ∞,


codim R_T ≤ codim R_{I+H} < ∞. □
A special case of theorem B is


Theorem 2. Let T : U → V and R : V → W be bounded maps with finite index.
Then their product RT has finite index, equal to the sum of the indices of R and
of T.


From theorem 2 we can deduce


Theorem 2'. If T and S are pseudoinverses to each other,


ind T = −ind S.   (7)


Proof. Applying the multiplicative law (3) to (5) we deduce that


ind T + ind S = ind (I + K).


According to theorem 0, ind (I + K) = 0, so (7) follows. □


The next result is due to Yood: 


Theorem 3. Suppose that T : U → V has finite index, and L is a compact linear
map U → V. Then T + L has finite index, and


ind (T + L) = ind T.


Proof. Since T has finite index, it has a pseudoinverse S. Obviously S is also a
pseudoinverse to T + L, for


S(T + L) = ST + SL = I + K + SL,


and since L is compact, so is SL; similarly (T + L)S = I + H + LS. So, by (7),


ind (T + L) = −ind S = ind T. □


Every bounded linear map T : U → V has a transpose T' : V' → U', defined by
the relation


(Tu, ℓ) = (u, T'ℓ).   (8)


Theorem 4. Let T : U → V have finite index; then so does its transpose T', and


ind T' = −ind T.   (9)


Proof. Let S be a pseudoinverse of T; taking the transpose of (5) gives


T'S' = I + K',   S'T' = I + H'.


Since according to theorem 7 of chapter 21 the transpose of a compact operator is
compact, it follows that T' and S' are pseudoinverse to each other. This proves that
T' has finite index. Next we show that


dim N_{T'} = codim R_T   and   codim R_{T'} = dim N_T.   (10)




It follows from (8) that the nullspace of T’ is the annihilator of the range of T; 
this proves the first part of (10). Similarly (8) shows that the nullspace of T is the 
annihilator in U of the range of T’. This would prove the second part of (10) if U 
were reflexive; otherwise, it only shows that 


codim R_{T'} ≥ dim N_T.   (10')


Subtracting the first part of (10) from (10') gives


−ind T' ≥ ind T.   (11)


By (11), applied to S,


−ind S' ≥ ind S.   (11')


Add this relation to (11):


−ind T' − ind S' ≥ ind T + ind S.


According to (7), both sides of the above inequality are zero. However, this can only
be if equality holds in both (11) and (11'), and therefore also in (10'). This completes
the proof of (10); relation (9) follows. □


The next result, due to Dieudonné, is called the stability of the index: 


Theorem 5. If T : U → V has finite index, there is ε > 0 such that for all bounded
linear maps M : U → V satisfying |M| < ε,


ind (T + M) = ind T.   (12)


Proof. Let S be a pseudoinverse to T; then by (5)


S(T + M) = I + K + SM.   (13)


Choose ε = |S|^{-1}; then |SM| < 1, so by theorem 2 of chapter 17, I + SM is
invertible. Multiply (13) by (I + SM)^{-1} on the left:


(I + SM)^{-1} S(T + M) = I + (I + SM)^{-1} K.


This shows that (I + SM)^{-1} S is pseudoinverse to T + M; so, by (7)


ind (T + M) = −ind (I + SM)^{-1} S.   (13')


Since (I + SM)^{-1} is invertible, multiplication by it doesn't change the index:


ind (I + SM)^{-1} S = ind S.


Setting this into (13') and using (7) gives (12). □




Theorem 5 can be reformulated as follows:


Theorem 5'. If T has finite index and {T_n} is a sequence that tends to T in the norm
topology, then for n large enough, T_n has finite index, and


lim_{n→∞} ind T_n = ind T.


A direct consequence of theorem 5' is


Theorem 5''. Let T(t) be a one-parameter family of mappings U → V, 0 ≤ t ≤ 1.
Suppose that for each t, T(t) has finite index and that T(t) depends continuously on
t in the norm topology. Then ind T(t) is independent of t; in particular,


ind T(0) = ind T(1).


This result, called the homotopy invariance of index, is enormously useful in
calculating the index of mappings. We will give examples later on.

It is easy to see that the index is not continuous in the strong topology for maps.
As an example, take U = V = ℓ². For u = (a_1, a_2, ...) define


T_n u = (a_1, a_2, ..., a_n, 0, a_{n+1}, ...).


Clearly, N_{T_n} = {0}, R_{T_n} = {v | v_{n+1} = 0}, so ind T_n = −1. But s-lim T_n = I,
whose index is zero. However, Hörmander has shown the continuity of the index for
strongly convergent sequences of maps under an additional assumption that is easily
verified in many cases:


Theorem 6 (Hörmander). Let T_n : U → V and S_n : V → U be sequences of
bounded linear maps such that


S_n T_n = I + K_n,   T_n S_n = I + H_n,   (14)


where {K_n} and {H_n} are uniformly compact sequences of mappings, in this sense:
the vectors


{K_n u, |u| ≤ 1, n = 1, 2, ...}   (15)


belong to a compact subset of U, and similarly the vectors


{H_n v, |v| ≤ 1, n = 1, 2, ...}   (15')


belong to a compact subset of V. Suppose that the sequences {T_n} and {S_n} converge
strongly:


s-lim T_n = T,   s-lim S_n = S.   (16)


Then


lim_{n→∞} ind T_n = ind T.   (17)


For a proof, see Hörmander, theorem 19.1.10.
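
The ℓ² example that precedes theorem 6 can be made concrete on finitely supported sequences. The sketch below is only an illustration under that assumption (a computer cannot hold a general element of ℓ²); the names `insert_zero` and `l2_distance` are ours. It shows T_n u approaching u for a fixed u, although every T_n misses the (n+1)st coordinate direction and so has index −1.

```python
import math

def insert_zero(u, n):
    """T_n u for a finitely supported sequence u = (a_1, a_2, ...): put a 0 after a_n."""
    padded = list(u) + [0.0] * max(0, n - len(u))
    return padded[:n] + [0.0] + padded[n:]

def l2_distance(u, v):
    """l^2 norm of u - v, treating missing trailing entries as zeros."""
    m = max(len(u), len(v))
    u = list(u) + [0.0] * (m - len(u))
    v = list(v) + [0.0] * (m - len(v))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

u = [1.0 / k for k in range(1, 201)]     # truncation of (1, 1/2, 1/3, ...)
errors = [l2_distance(insert_zero(u, n), u) for n in (1, 10, 100, 200)]
print(errors)                            # decreasing; exactly 0 once n reaches the support
```

The convergence is for each fixed u only. The norm of T_n − I does not tend to zero, since T_n moves the (n+1)st unit vector to the (n+2)nd, so theorem 5' does not apply.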
The following is an immediate consequence of theorem 6: 


Theorem 6'. Let T(t) : U → V and S(t) : V → U be bounded linear maps
depending on a parameter t, 0 ≤ t ≤ 1, continuously in the strong topology. Assume
that S(t) and T(t) are pseudoinverses of each other:


S(t)T(t) = I + K(t),   T(t)S(t) = I + H(t),


where the maps K(t) and H(t) are uniformly compact in the sense of (15), (15').
Then ind T(t) is independent of t; in particular,


ind T(0) = ind T(1).


NOTE. Operators with finite index are usually called Fredholm operators. There is 
no historical justification for this. It would be more appropriate to call operators with 
index ≠ 0 Noether operators, and their index the Noether index, for Fritz Noether had
given the first example of such an operator, defined the notion of index, and proved its 
stability, in the context of singular integral operators; see Hörmander and Dieudonné. 
The important multiplicative property (3) is due to Atkinson and Gohberg. 


HISTORICAL NOTE. Fritz Noether was the brother of Emmy Noether; he fled to the 
Soviet Union to escape Nazi terror, and there fell victim to Stalin’s terror. 


We turn now to some examples of operators whose index can be calculated
explicitly. In chapter 30 we will show that in some cases the index can be expressed as
the difference of the traces of two operators.


27.2 TOEPLITZ OPERATORS 


We denote by L²(S¹) the space of square integrable complex-valued functions u on
the unit circle S¹, with norm


‖u‖² = ∫ |u(θ)|² dθ/2π.   (18)


The functions e^{ikθ} form an orthonormal basis; every u in L² can be expanded as


u(θ) = Σ_{k=−∞}^{∞} u_k e^{ikθ},   (19)


where the Fourier coefficients are given by


u_k = ∫ u(θ) e^{−ikθ} dθ/2π.   (20)


The Parseval relation holds:


‖u‖² = Σ_{k=−∞}^{∞} |u_k|².   (21)


Definition. H_+ is the subspace of L²(S¹) consisting of functions u whose negative
Fourier coefficients are zero:


u in H_+ iff u_k = 0 for k < 0.   (22)


Given u in L², of form (19), we define its projection P_+ u onto H_+ by


P_+ u = Σ_{k=0}^{∞} u_k e^{ikθ},   (23)


where u_k are its Fourier coefficients. It follows from (21) that


‖P_+‖ = 1.   (24)


The space H_− and the projection P_− onto it are defined analogously.

The space H_+ consists of boundary values of functions analytic in the unit disk.
The elements of H_− are boundary values of functions antianalytic in the unit disk,
that is, functions whose complex conjugate is analytic.


Definition. Let s(θ) be a continuous complex-valued function on the unit circle S¹.
We associate with the function s the Toeplitz operator T_s, mapping H_+ into H_+,
given by the formula


T_s u = P_+(su),   u in H_+.   (25)


In words, T_s is multiplication by s, followed by projection into H_+. We call s the
symbol of T_s. Clearly, T_s depends linearly on its symbol: T_{s+r} = T_s + T_r.


When we represent functions of class H_+ in terms of their Fourier coefficients, a
Toeplitz operator becomes a truncated discrete convolution:


(T_s u)_k = Σ_{j=0}^{∞} s_{k−j} u_j,   k = 0, 1, ....   (25')


Here s_n and u_n denote the nth Fourier coefficients of the functions s and u,
respectively. The semi-infinite matrix in (25') has identical entries along each of its
dexter diagonals k − j = const. Such matrices are called Toeplitz matrices; they arise
naturally in discretizations of partial differential operators (e.g., see S. Parter and
S. Osher) and in statistical mechanics (see McCoy).
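
Formula (25') is easy to try out on a finite corner of the matrix. The following sketch assumes a trigonometric polynomial symbol stored as a dictionary of Fourier coefficients; the symbol and the names `toeplitz_section` and `apply_section` are hypothetical, chosen only for illustration.

```python
def toeplitz_section(symbol_coeffs, size):
    """size x size corner of the semi-infinite matrix (s_{k-j}) in (25').

    symbol_coeffs maps an integer n to the Fourier coefficient s_n;
    missing keys are treated as zero.
    """
    return [[symbol_coeffs.get(k - j, 0) for j in range(size)] for k in range(size)]

def apply_section(matrix, u):
    """Truncated discrete convolution: (T_s u)_k = sum_j s_{k-j} u_j."""
    return [sum(row[j] * u[j] for j in range(len(u))) for row in matrix]

# Hypothetical symbol s(theta) = 2 + e^{i theta} + 3 e^{-i theta}
s = {0: 2, 1: 1, -1: 3}
M = toeplitz_section(s, 4)
for row in M:
    print(row)
# [2, 3, 0, 0]
# [1, 2, 3, 0]
# [0, 1, 2, 3]
# [0, 0, 1, 2]
print(apply_section(M, [1, 0, 0, 0]))   # [2, 1, 0, 0]
```

The last line applies T_s to the constant function 1: the product s·1 has the term 3e^{−iθ}, which P_+ removes, leaving 2 + e^{iθ}.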


Theorem 7. Let s be a continuous complex-valued function on S¹, T_s the Toeplitz
operator with symbol s. Then


(i) T_s is a bounded map of H_+ → H_+, and


‖T_s‖ ≤ |s|_max.   (26)


(ii) If s is nowhere zero on S¹, then T_s has finite index.


(iii)
ind T_s = −W(s),   (27)
where W(s) is the winding number of s(θ) around the origin.


Recall that the winding number of a curve s(θ) around 0 is the increase in the
argument of s(θ) as θ goes from 0 to 2π, divided by 2π. Analytically, for differentiable s,


W(s) = (1/2π) Im ∫_0^{2π} s(θ)^{-1} (ds/dθ) dθ.   (28)
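
The verbal definition of the winding number translates directly into a computation: sample s around the circle and add up the small increments of the argument. This is a sketch under the stated assumptions (s continuous, 2π-periodic, nowhere zero, and sampled finely enough that the argument changes by less than π per step); the function name `winding_number` is ours.

```python
import cmath
import math

def winding_number(s, samples=2000):
    """Approximate W(s): total change of arg s(theta) over [0, 2*pi], divided by 2*pi."""
    total = 0.0
    prev = s(0.0)
    for j in range(1, samples + 1):
        cur = s(2 * math.pi * j / samples)
        total += cmath.phase(cur / prev)   # increment of the argument, in (-pi, pi]
        prev = cur
    return round(total / (2 * math.pi))

print(winding_number(lambda t: cmath.exp(3j * t)))                       # 3
print(winding_number(lambda t: 2 + cmath.exp(1j * t)))                   # 0
print(winding_number(lambda t: cmath.exp(-2j * t) * (3 + math.sin(t))))  # -2
```

By theorem 7 the corresponding Toeplitz operators have index −3, 0 and 2.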


Proof. (i) Multiplication by s is a bounded operation, with bound |s|_max, and P_+
is bounded, with norm 1. Since T_s is the product of these two, (26) follows.
(ii) We claim that T_{s^{-1}} is a pseudoinverse of T_s. To verify this, we need


Lemma 8. For s continuous,


C = P_+ s − s   (29)


is a compact map of H_+ into L².


Proof. Since s is continuous, given any ε we can approximate s uniformly by a
trigonometric polynomial s_ε so that


|s(θ) − s_ε(θ)| < ε for all θ.   (30)


The mapping C_ε = P_+ s_ε − s_ε annihilates any function u in H_+ of the form
u = Σ_{k≥M} u_k e^{ikθ}, M the degree of s_ε. Since these functions form a linear subspace
of H_+ of codimension M, it follows that the range of C_ε has dimension ≤ M. In particular,
each C_ε is compact. It follows from (26) and (30) that C_ε tends to C uniformly in
the norm. Since the uniform limit of compact maps is compact, (29) is compact. □


We can perform now the pseudoinversion; we form the product and get, using
(29), that


T_{s^{-1}} T_s = P_+ s^{-1} P_+ s = P_+ s^{-1}(s + P_+ s − s) = I + P_+ s^{-1} C.


Since C is compact, it follows that T_{s^{-1}} T_s differs by a compact operator from the
identity. Since s and s^{-1} play symmetric roles, it follows that T_s and T_{s^{-1}} are
pseudoinverses.


(iii) To actually compute the index of T_s, we will deform it continuously by
deforming s. We will make use of the following result of topology:


Lemma 9. Two members of the class of continuous, complex-valued, nowhere zero
functions s on S¹ can be continuously deformed into each other within that class if
and only if they have the same winding number W(s).


Proof. That the winding numbers have to be the same follows from the invariance
of the winding number under deformation. To prove the converse, take first the case
that the winding number of s is zero. Such a function has a single-valued logarithm,
log s(θ). Deform this function to zero as t log s(θ). Exponentiation yields


s(θ, t) = e^{t log s(θ)},   1 ≥ t ≥ 0,


a deformation of s(θ) into the constant function 1.
Given any s of winding number N, we write it as


s(θ) = e^{iNθ}(e^{−iNθ} s(θ)).


The second factor has winding number zero, and therefore can be deformed into the
constant function 1. So s(θ) can be deformed into e^{iNθ}, N = W(s). □

Analytically the simplest curve that winds N times around the origin is e^{iNθ}. For
N positive the Toeplitz operator T_N whose symbol is e^{iNθ} is just multiplication by
e^{iNθ}. Clearly, multiplication by e^{iNθ} has only the trivial nullspace, and its range in
H_+ has codimension N, since it consists of functions of form Σ_{k≥N} u_k e^{ikθ}. Therefore


ind T_N = −N.   (31)


For N < 0 the mapping T_N = P_+ e^{iNθ} is onto H_+, and its nullspace consists of
linear combinations of 1, e^{iθ}, ..., e^{i(|N|−1)θ}; thus it has dimension |N|. Therefore (31)
holds for N < 0 as well.
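
In terms of Fourier coefficients, T_N is simply a shift, which makes the count behind (31) visible. The sketch below assumes a function in H_+ with finitely many nonzero coefficients, stored as a list; the name `toeplitz_shift` is ours.

```python
def toeplitz_shift(u, N):
    """Coefficients of T_N u = P_+(e^{iN theta} u) for u = sum_{k>=0} u[k] e^{ik theta}."""
    if N >= 0:
        return [0] * N + list(u)      # multiplication by e^{iN theta}: shift right
    return list(u)[-N:]               # shift left; P_+ discards the modes k < 0

print(toeplitz_shift([1, 2, 3], 2))   # [0, 0, 1, 2, 3]
print(toeplitz_shift([1, 2, 3], -2))  # [3]
print(toeplitz_shift([5, 7], -2))     # []  (the zero function)
```

For N = 2 the first two coefficients of the output are always zero, so the range has codimension 2 and nothing is annihilated. For N = −2 every combination of 1 and e^{iθ} is sent to zero, while any prescribed output can be reached by prepending two coefficients.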


We have shown in lemma 9 that every s(θ) that is ≠ 0 can be deformed into e^{iNθ};
that is, there is a one-parameter family s(θ, t), continuous in θ, t, such that


s(θ, t) ≠ 0,   s(θ, 0) = s(θ),   s(θ, 1) = e^{iNθ}.


Since the winding number W(s) is invariant under continuous deformation,


W(s) = W(s(0)) = W(s(1)) = N.


It follows from (26) that


‖T_{s(t)} − T_{s(t')}‖ = ‖T_{s(t)−s(t')}‖ ≤ |s(t) − s(t')|_max;


since s(θ, t) depends continuously on t, T_{s(t)} depends continuously on t in the norm
topology. Appealing to the homotopy invariance of the index, theorem 5'', we
conclude that


ind T_s = ind T_N.


Combining this with (31) and the identification of N as the winding number of s we
get (27). □


In the course of proving theorem 7, we have shown that for the special function
s_N(θ) = e^{iNθ} the dimension of the nullspace of T_N is either 0 or |N|, depending on
the sign of N. This turns out to be true for all functions s.


Theorem 10. Let s be a continuous, complex-valued, nowhere zero function on the
unit circle S¹, T_s the Toeplitz operator with symbol s. Then


(i) If W(s) > 0, then T_s is one-to-one, and its range has codimension W(s).


(ii) If W(s) < 0, then T_s has a nullspace of dimension |W(s)|, and maps H_+ onto
H_+.
(iii) If W(s) = 0, then T_s is invertible.


Proof. We first prove part (iii); when the winding number of s is zero, s has a
single-valued logarithm:


s(θ) = exp ℓ(θ),   ℓ(θ) = log s(θ).


We split ℓ into its analytic and antianalytic parts:


ℓ = ℓ_+ + ℓ_−,   ℓ_+ in H_+,   ℓ_− in H_−.


We assume at first that s is smooth, say C^∞; then so is ℓ and so are ℓ_+ and ℓ_−.
Exponentiate:


s = s_+ s_−,   s_+ = exp ℓ_+,   s_− = exp ℓ_−.   (32)


Here s_+ is the boundary value of an analytic function, and s_− is the boundary
value of an antianalytic function. Both are continuous up to the boundary, and
nonzero in the closed unit disk. We show now how to invert T_s with the aid of s_+
and s_−. Write


T_s u = P_+ su = f,


u, f in H_+. This equation means that


su = f + g_−,   g_− in H_−.

Express s as s_+ s_−, and divide by s_−:


s_+ u = s_−^{-1} f + s_−^{-1} g_−.


Since s_−^{-1} = exp(−ℓ_−), the product s_−^{-1} g_− belongs to H_−; applying P_+ gives
s_+ u = P_+ s_−^{-1} f. Divide by s_+:


u = s_+^{-1} P_+ s_−^{-1} f.   (32')


This shows that s_+^{-1} P_+ s_−^{-1} is the inverse of T_s.

We turn now to parts (i) and (ii). Denote the winding number of s by W. The
function s e^{−iWθ} has winding number zero; therefore the mapping u → f given by


P_+ s e^{−iWθ} u = f


is invertible. This is the same as saying that T_s maps e^{−iWθ} H_+ one-to-one onto H_+.
From this (i) and (ii) follow. □


We remark that formula (32') is not only a theoretical tool but a practical method
for inverting T_s.
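
To illustrate this remark, here is a minimal numerical sketch of (32'), not taken from the text. It assumes a smooth symbol with W(s) = 0, sampled on a uniform grid that is fine enough for aliasing to be negligible, and it uses a naive discrete Fourier transform so as to need only the standard library. All function names and the sample symbol are ours.

```python
import cmath
import math

def dft(values):
    """Fourier coefficients of grid samples; index k >= n/2 stands for mode k - n."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * math.pi * j * k / n) for j, v in enumerate(values)) / n
            for k in range(n)]

def idft(coeffs):
    """Grid samples of the function with the given Fourier coefficients."""
    n = len(coeffs)
    return [sum(c * cmath.exp(2j * math.pi * j * k / n) for k, c in enumerate(coeffs))
            for j in range(n)]

def project_plus(values):
    """P_+ on grid samples: keep the modes 0 <= k < n/2, discard the negative ones."""
    n = len(values)
    return idft([c if k < n // 2 else 0.0 for k, c in enumerate(dft(values))])

def invert_toeplitz(s, f):
    """Solve P_+(s u) = f by u = s_+^{-1} P_+ (s_-^{-1} f); assumes W(s) = 0."""
    ell, arg, prev = [], cmath.phase(s[0]), s[0]
    for value in s:                        # continuous branch of log s along the grid
        arg += cmath.phase(value / prev)
        prev = value
        ell.append(complex(math.log(abs(value)), arg))
    ell_plus = project_plus(ell)           # analytic part; ell - ell_plus is antianalytic
    s_plus = [cmath.exp(a) for a in ell_plus]
    s_minus = [cmath.exp(a - b) for a, b in zip(ell, ell_plus)]
    g = project_plus([fv / sm for fv, sm in zip(f, s_minus)])
    return [gv / sp for gv, sp in zip(g, s_plus)]

n = 64
theta = [2 * math.pi * j / n for j in range(n)]
s = [2 + 0.5 * cmath.exp(1j * t) + 0.3 * cmath.exp(-1j * t) for t in theta]
f = [1 + 0.5 * cmath.exp(2j * t) for t in theta]
u = invert_toeplitz(s, f)
check = project_plus([sv * uv for sv, uv in zip(s, u)])
residual = max(abs(c - fv) for c, fv in zip(check, f))
print(residual < 1e-9)   # True
```

The constant term of log s is assigned to ℓ_+ here; any other split of the constant between the two factors leads to the same u.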
We return now to the case when s is merely continuous, and W(s) = 0. For any
ε > 0 we can approximate s uniformly by a smooth function r so that


|r − s|_max < ε.   (33)


For ε small enough


|r^{-1}s − 1|_max < 1/2.   (33')


We draw two conclusions from this inequality:

(i) It follows from (33') combined with (26) that


‖T_{r^{-1}s} − I‖ < 1/2.


It follows from theorem 2 of chapter 17 that T_{r^{-1}s} is an invertible operator.

(ii) It follows from (33') that r and s have the same winding number. Since we
have assumed that W(s) = 0, also W(r) = 0. Since r is smooth, it can be
factored as in (32):


r = r_+ r_−,


where r_+ is the boundary value of an analytic function that is nowhere zero
in the unit disk, and r_− the boundary value of a nowhere zero antianalytic
function in the unit disk. Therefore W(r_+) = 0 = W(r_−).


We claim that the operator T_{r^{-1}s} can be factored as follows:


T_{r^{-1}s} = T_{r_−^{-1} s r_+^{-1}} = P_+ r_−^{-1} s r_+^{-1} = P_+ r_−^{-1} P_+ s P_+ r_+^{-1} = T_{r_−^{-1}} T_s T_{r_+^{-1}}.


This is so because the operator P_+ to the left of r_+^{-1} acts as the identity; the operator
P_+ to the left of s removes an antianalytic function that would have been removed
by the leftmost operator P_+. As observed above, the operator T_{r^{-1}s} on the left is
invertible; so are the two operators T_{r_−^{-1}} and T_{r_+^{-1}} on the right, because the winding
numbers of r_+ and r_− are zero. It follows therefore that the third operator in the
product on the right, T_s, is invertible too. This proves part (iii) of theorem 10. □

The proof above is due to Gohberg, who pointed out that it also applies to
piecewise continuous functions s, provided that there is some continuous function r such
that inequality (33') is satisfied, with any constant on the right less than 1:


|r^{-1}s − 1|_max < c,   c < 1.   (33'')


The geometric meaning of the restriction (33'') becomes clear if we multiply both
sides by |r| and write it as |r(θ) − s(θ)| < c|r(θ)|, valid for all θ. If r(θ) is the position
of a person taking a walk around the origin, and s(θ) the position of that person's
dog, held on a leash of variable length but always less than c|r(θ)|, no matter how
busily the dog jumps around in the circle to which it is confined by the leash, it must
circle the origin the same number of times as its master. For a full discussion we refer
to Gohberg and Krupnik, as well as the last chapter in Douglas.

Krein and Gohberg have extended theorem 7 to n × n matrix-valued functions
S(θ), acting by multiplication on vector-valued functions u(θ). For fixed n, we denote
by H_+ that subspace of the L² vector-valued functions on S¹ whose negative
Fourier coefficients are zero. P_+ is orthogonal projection onto H_+.

S(θ) denotes a continuous matrix-valued function on S¹ whose entries are
complex-valued functions. Since S(θ) is a bounded function, the matrix Toeplitz
operator


T_S = P_+ S   (34)


is a bounded mapping of H_+ → H_+.


Theorem 11. Let S(θ) denote a continuous, complex matrix-valued function on S¹,
invertible at all points of S¹.


(i) T_S : H_+ → H_+ defined by (34) has a pseudoinverse.
(ii) Since S(θ) is invertible, s(θ) = det S(θ) is nonzero on S¹; we claim that


ind T_S = −W(det S).


Proof. T_{S^{-1}} is a pseudoinverse of T_S; the proof is the same as in theorem 7. To
calculate its index, we deform S into something simple, using the following
topological result:


Lemma 12. Two matrix-valued continuous functions on S¹ invertible at all points
can be deformed continuously into each other within this class iff the
winding numbers of their determinants are equal.


Starting with any S, we deform it into S_N of diagonal form:


S_N(θ) = diag(e^{iNθ}, 1, ..., 1),   N = W(det S),


where all diagonal entries but the first are = 1. The matrix Toeplitz operator T_N,
whose symbol is S_N above, is the direct sum of scalar Toeplitz operators; its index
can be computed componentwise by formula (31): ind T_N = −N. Theorem 11 now
follows from the homotopy invariance of the index. □


Exercise 1. Show that for


S(θ) = diag(e^{iθ}, e^{−iθ}),


dim N_{T_S} = 1,   codim R_{T_S} = 1.


The preceding example shows that theorem 10 is false for matrix-valued symbols.
When S(θ) can be factored as S = S_− S_+, S_− antianalytic, S_+ analytic, both
invertible at every point of the unit disk, then T_S^{-1} = S_+^{-1} P_+ S_−^{-1}. Even when it exists, such
a factorization can no longer be performed by taking logarithms. A method based on
solving a system of PDEs is given in Lax.

In the proof of theorem 11 we have made essential use of topological notions 
and results. Conversely, notions and results from index theory are powerful tools in 
differential topology; a basic result of this kind is the Atiyah-Singer index theorem. 

An important extension of the theory of Toeplitz operators, replacing S¹ by the
real axis, has been given by Wiener and Hopf. An extension to functions of two
variables has been given by Strang. Further generalizations of the notion of Toeplitz
operators are due to L. Boutet de Monvel and V. Guillemin, and to C. Berger and 
Coburn. In the farthest reaching generalization the notion of dimension of nullspace 
and range is taken in the sense of the dimension function in von Neumann algebras. 


27.3 HANKEL OPERATORS 


A companion to Toeplitz operators can be constructed by replacing the projection
into H_+ by projection into H_−:


Definition. Let s(θ) be a continuous function on the unit circle S¹. We define the
Hankel operator H_s as the mapping of H_+ into H_− given by the formula


H_s u = P_−(su).   (35)


If we represent functions of class H_+ and H_− by their Fourier coefficients, a Hankel
operator appears as


(H_s u)_k = Σ_{j=0}^{∞} s_{k+j} u_j,   k = 0, 1, ....   (35')


Here s_n denotes the (−n)th Fourier coefficient of s, and u_j, j = 0, 1, ... the Fourier
coefficients of u. Note that in order to maintain symmetry we have included the
zeroth Fourier term in both H_+ and H_−. The semi-infinite matrix representing a
Hankel operator has identical entries along each sinister diagonal k + j = const.
Such a matrix is called a Hankel matrix.


Exercise 2. Show that every Hankel operator is compact. 


Exercise 3. Show that the norm of a Hankel operator H_s satisfies


‖H_s‖ ≤ inf_q |s − q|_max,   (36)


where q ranges over all analytic functions in the unit disk that are continuous on
the unit circle S¹, and zero at z = 0. According to a theorem of Nehari, the sign of
equality holds in (36).


For further reading turn to chapter 19 of Hörmander. 


BIBLIOGRAPHY 


Atkinson, F. V. The normal solubility of linear operators in normed space. Mat. Sbornik, N.S., 28 (1951): 3-14.

Berger, C. A. and Coburn, L. A. Toeplitz operators and quantum mechanics. Funct. Anal., 68 (1986): 273-299.

Boutet de Monvel, L. and Guillemin, V. The spectral theory of Toeplitz operators. An. Math. Studies, 99 (1981).

Böttcher, A. and Grudsky, S. M. Toeplitz Matrices, Asymptotic Linear Algebra and Functional Analysis, Birkhäuser, Boston, 2000.

Dieudonné, J. Sur les homomorphismes d'espaces normés. Bull. Sci. Math., (2), 67 (1943): 72-84.

Dieudonné, J. The index of operators in Banach spaces. Integral Eq. Oper. Theory, 8 (1985): 580-589.

Douglas, R. G. Banach Algebra Techniques in Operator Theory, 2nd ed. Graduate Texts in Mathematics, 179. Springer, New York, 1988.

Gohberg, I. C. On linear equations in normed space. Dokl. Akad. Nauk SSSR (N.S.), 76 (1951): 477-480.

Gohberg, I. C. and Krein, M. G. Systems of integral equations on a half line with kernels depending on the difference of arguments. AMS Trans., 14 (1960): 217-288.

Gohberg, I. C. and Krupnik, N. Ja. The algebra generated by Toeplitz matrices. Funct. Anal. Appl., 3 (1969): 119-137.

Hörmander, L. The Analysis of Linear Partial Differential Operators II, Springer, New York, 1985.

Lax, P. D. On the factorization of matrix valued functions. CPAM, 29 (1976): 683-688.

McCoy, B. M. Introductory remarks to Szegö's paper. On Certain Hermitean Forms Associated with the Fourier Series of a Positive Function, Vol. 1. Gábor Szegö's Collected Papers. Birkhäuser, Boston, 1981.

Noether, F. Über eine Klasse singulärer Integralgleichungen. Math. Ann., 82 (1921): 42-63.

Osher, S. Systems of difference equations with general homogeneous boundary conditions. Trans. AMS, 137 (1969): 177-201.

Parter, S. V. On the eigenvalues of certain generalizations of Toeplitz matrices. Arch. Rat. Mech. Anal., 11 (1962): 244-257.

Sarason, D. Toeplitz operators with piecewise quasicontinuous symbols. Indiana U. Math. J., 26 (1977): 817-838.

Strang, G. Toeplitz operators in a quarter plane. Bull. AMS, 76 (1970): 1303-1307.

Yood, B. Properties of linear transformations preserved under addition of a completely continuous transformation. Duke Math. J., 18 (1951): 599-612.


COMPACT SYMMETRIC 
OPERATORS IN 
HILBERT SPACE 


One of the most beautiful, as well as the most useful, results of linear algebra is
the spectral theory of hermitean symmetric matrices. We recall that a matrix A is 
hermitean symmetric if it is its own adjoint: 


A* =A, 


The spectral theory of such matrices says that A has a complete set of orthogonal 
eigenvectors, and that the corresponding eigenvalues are real. This result has a perfect 
generalization, due to Hilbert, to hermitean symmetric operators in a Hilbert space 
that are compact. In this chapter we present this generalized theory, and give some 
concrete applications. 


Definition. An operator A mapping a complex Hilbert space H into itself is called
hermitean symmetric (symmetric for short) if it is its own adjoint, that is, if for all
x and y in H


(Ax, y) = (x, Ay).   (1)


Exercise 1. Show that a symmetric operator A as above is closed. Show that A is 
bounded. 


Theorem 1. Let A be a symmetric operator.


(i) The (hermitean) quadratic form (Ax, x) is real for all x in H.
(ii) The quadratic form (2) is not identically zero unless the operator A = 0.


Proof. Set y = x in (1):


(Ax, x) = (x, Ax).   (2)


Since the scalar product in a complex Hilbert space is skew symmetric, (2) is real. If
(Ax, x) = 0 for all x, then by setting x + y, and then x + iy, in place of x we deduce
that the bilinear form (Ax, y) = 0 for all x and y in H. It follows that Ax = 0 for all x. □


Definition. A symmetric operator K mapping a Hilbert space H into itself is called
positive if the associated quadratic form (Kx, x) is nonnegative for every x in H.
This is denoted as 0 ≤ K.


Definition. Let A and B denote two symmetric operators mapping a Hilbert space
H into itself. The inequality A ≤ B (A less than B, B greater than A) means that
0 ≤ B − A.


Exercise 2. Show that the sum of two positive operators is positive.
Exercise 3. Show that if A ≤ B and C ≤ D, then A + C ≤ B + D.


Strict positivity and strict inequality are defined analogously.

In this chapter we will study the spectral theory of compact symmetric operators.
We recall from chapter 21 that an operator A : H → H is called compact if it maps
the unit ball in H into a compact set, that is, if the set {Ax, ‖x‖ ≤ 1} is precompact.
We recall for convenience the notion of precompact; a set is precompact if its closure
is compact. In a metric space this can be expressed in two equivalent ways:


Definition I. A subset R of a metric space is precompact if every sequence {z_n} of
vectors in R contains a convergent subsequence.


Definition II. A subset R is precompact if for any ε > 0 the set R can be covered
by a finite number of balls of radius ε.


Theorem 2. A compact symmetric operator A maps a weakly convergent sequence
{x_n} into a sequence {Ax_n} that is strongly convergent.


Proof. It follows from (1) that if {x_n} converges weakly, also {Ax_n} converges
weakly. A weakly convergent sequence is uniformly bounded: ‖x_n‖ ≤ const. It
follows from the definition of a compact operator that {Ax_n} lies in a precompact
set; so by definition I a subsequence of {Ax_n} converges strongly to some limit z.
We claim that the whole sequence converges to z; for if not, there would be some
ε-ball centered at z such that infinitely many Ax_n lie outside it. By precompactness, this
sequence has a subsequence that converges strongly to a limit z'; clearly, ‖z' − z‖ ≥ ε.
But that contradicts the weak convergence of the whole sequence {Ax_n}. □


We state now the main result of this chapter, the spectral theorem: 


Theorem 3. A denotes a compact symmetric operator mapping a complex Hilbert
space H into itself. Then there is an orthonormal base {z_n} for H consisting of
eigenvectors of A:


A z_n = α_n z_n.   (3)


The eigenvalues α_n are real and their only point of accumulation is 0.


Proof. If A is the zero operator, any orthonormal basis will do. Suppose that A ≠ 0.
By theorem 1, the quadratic form (Ax, x) ≢ 0, so it takes on some nonzero value,
say positive. Look at the values of the quadratic form on the unit sphere ‖x‖ = 1.
Since A is a bounded operator, (Ax, x) does not exceed ‖A‖ on the unit sphere; we
claim that it achieves its maximum there. To see this, denote by m its supremum:


sup_{‖x‖=1} (Ax, x) = m;   (4)


m is a positive quantity. Let {x_n} be a maximizing sequence. Since the unit ball in a
Hilbert space is weakly sequentially compact (see theorem 7 in chapter 10), a
subsequence, also denoted as {x_n}, converges weakly to a limit we denote as z. We claim
that z solves our maximum problem, for, by theorem 2, since A is compact, Ax_n
converges strongly to Az. It follows from this that (Ax_n, x_n) converges to (Az, z).
Since {x_n} is a maximizing sequence, the former converges to m. Therefore


(Az, z) = m.   (4')


To show that z maximizes (4) we have to verify that z is a unit vector. Since z is a
weak limit of unit vectors, its norm is ≤ 1 according to theorem 5 in chapter 10. Since
m is positive, (4') shows that z ≠ 0. Now define y = z/‖z‖; it is a unit vector, and


(Ay, y) = (Az, z)/‖z‖² = m/‖z‖².


If ‖z‖ were less than 1, (Ay, y) would be greater than m, contradicting the definition
(4) of m as the supremum.
The homogeneous function


R_A(x) = (Ax, x)/‖x‖²   (5)


is called the Rayleigh quotient. Clearly, the vector z maximizes R_A(x) among all
nonzero vectors, not just unit vectors. Let w be any vector in H, t any real number.
The function R_A(z + tw) as function of t achieves its maximum at t = 0; therefore
by calculus its t-derivative is zero there. Differentiating (5) yields


d/dt R_A(z + tw)|_{t=0} = [(Aw, z) + (Az, w)]/‖z‖² − (Az, z) [(w, z) + (z, w)]/‖z‖⁴ = 0,


from which, using the symmetry (1) of A and (4'), we get


Re (Az − mz, w) = 0.


Since w is an arbitrary vector, Az − mz = 0; that is, z is an eigenvector of A, with
eigenvalue m.


Once we have proved the existence of one eigenvector, we can deduce the
existence of a complete set. This is based on the observation that for a symmetric
operator A, the orthogonal complement of an eigenvector is invariant under A. To see this,
suppose that z is an eigenvector, and y is orthogonal to z; then Ay is orthogonal to z,


(Ay, z) = (y, Az) = (y, αz) = 0.


It follows that if y is orthogonal to a collection {z_m} of eigenvectors, so is Ay.


Now take {z_m} to be the collection of all eigenvectors of A; denote by Y their
orthogonal complement. As shown above A maps Y into itself, and of course, A
restricted to Y is symmetric. Therefore, as shown above, unless Y consists of the zero
vector, it would contain an eigenvector of A, a contradiction since Y is orthogonal to
all eigenvectors. □
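
The first step of the proof, maximizing the Rayleigh quotient over the unit sphere, can be watched in the smallest nontrivial case. This is a sketch only: it assumes a real symmetric 2 × 2 matrix, where the unit sphere is a circle that can be searched on a grid of angles; the matrix and the name `rayleigh` are chosen for illustration.

```python
import math

def rayleigh(A, x):
    """R_A(x) = (Ax, x) / ||x||^2 for a real symmetric matrix A (list of rows)."""
    Ax = [sum(a * xj for a, xj in zip(row, x)) for row in A]
    return sum(p * q for p, q in zip(Ax, x)) / sum(q * q for q in x)

A = [[2.0, 1.0], [1.0, 2.0]]            # eigenvalues 3 and 1
angles = [math.pi * j / 1000 for j in range(1000)]
best = max(angles, key=lambda phi: rayleigh(A, [math.cos(phi), math.sin(phi)]))
z = [math.cos(best), math.sin(best)]    # maximizer on the unit circle
m = rayleigh(A, z)                      # maximum value of the quadratic form
residual = [sum(a * zj for a, zj in zip(row, z)) - m * zi for row, zi in zip(A, z)]
print(m, residual)
```

Here R_A at angle φ equals 2 + sin 2φ, so the maximum m = 3 is attained at φ = π/4, which lies on the grid; the residual Az − mz vanishes up to rounding, as the variational argument predicts.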


Exercise 4. Show, using theorem 2, that the sequence of eigenvalues tends to zero.


Exercise 5. Show that if w-lim x_n = x, and if lim ‖x_n‖ = ‖x‖, then x_n converges
strongly to x.

The argument used to prove the existence of the first eigenvector was constructive.
The same constructive argument can be used to furnish all subsequent eigenvectors.
Arrange the positive eigenvalues of A in decreasing order:


A z_n = α_n z_n,   α_1 ≥ α_2 ≥ ... > 0.   (6)


Then


α_N = max_{(x, z_k)=0, k<N} (Ax, x)/‖x‖².   (6')


The vector that maximizes (6') is the Nth eigenvector. The negative eigenvalues of
A can be characterized by similar minimum problems.

Suppose, as is often the case, that we are interested in the eigenvalues rather than
the eigenvectors; then formula (6') is not so useful, for this maximum problem
involves explicitly the unwanted eigenvectors. Fortunately there are formulas, two
distinct ones, one due to E. Fischer, the other to R. Courant, that characterize the Nth
eigenvalue of A without reference to eigenvectors of previous eigenvalues.


Theorem 4. Let A be a compact symmetric operator; denote its positive
eigenvalues, indexed in decreasing order, by α_k, k = 1, 2, ...; see (6). Denote by R_A(x) its
Rayleigh quotient, defined by (5).


(i) Fischer's principle:


α_N = max_{S_N} min_{x in S_N} R_A(x),   (7)


where S_N is any linear subspace of H of dimension N.
(ii) Courant's principle:


α_N = min_{S_{N−1}} max_{x ⊥ S_{N−1}} R_A(x).   (8)


Proof. (i) Since S_N is N dimensional, it contains a nonzero vector y satisfying
the N − 1 linear conditions (y, z_k) = 0, k = 1, ..., N − 1. For such a vector y it
follows from (6') that R_A(y) ≤ α_N. Since y belongs to S_N, it follows that


min_{x in S_N} R_A(x) ≤ α_N.   (9)


Inequality (9) holds for any subspace of dimension N. On the other hand, if we
take S_N to be the space spanned by the eigenvectors z_1, ..., z_N, the minimum of the
Rayleigh quotient on S_N, reached for x = z_N, is α_N. This proves (7).

(ii) Given any subspace S_{N−1} of dimension N − 1, the N-dimensional space
spanned by the first N eigenvectors contains a vector y that is perpendicular to S_{N−1}.
Since for every vector y in the span of the first N eigenvectors R_A(y) ≥ α_N, it
follows that for every subspace S_{N−1} of dimension N − 1,


max_{x ⊥ S_{N−1}} R_A(x) ≥ α_N.   (10)


On the other hand, if we take S_{N−1} to be the space spanned by z_1, ..., z_{N−1}, then
according to (6') the sign of equality holds in (10). This proves (8). □


Similar pairs of variational principles hold for the negative eigenvalues. 

In a finite-dimensional space (7) and (8) hold for all eigenvalues, positive and 
negative. In this case the two principles are equivalent; (7) applied to —A gives (8). 

In an infinite-dimensional Hilbert space (7) and (8) are distinct; (7) can be used to
give lower bounds for the Nth positive eigenvalue, whereas (8) can be used to give
upper bounds.
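As a finite-dimensional illustration of how Fischer's principle (7) yields lower bounds, the following sketch takes A = diag(3, 2, 1) on R^3, so that the second eigenvalue is 2, and computes the minimum of the Rayleigh quotient over two-dimensional subspaces. The matrix, the helper names, and the random sampling are assumptions chosen for illustration; they are not part of the text.

```python
import math
import random


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def apply_diag(d, x):
    """Apply the diagonal matrix diag(d) to the vector x."""
    return [di * xi for di, xi in zip(d, x)]


def min_rayleigh_on_plane(d, u, v):
    """Minimum of the Rayleigh quotient of diag(d) over span{u, v}.

    u and v are assumed linearly independent. The minimum is the smaller
    eigenvalue of the 2x2 matrix of the quadratic form in an orthonormal
    basis {p, q} of the plane.
    """
    nu = math.sqrt(dot(u, u))
    p = [a / nu for a in u]
    c = dot(v, p)
    w = [b - c * a for a, b in zip(p, v)]  # Gram-Schmidt step
    nw = math.sqrt(dot(w, w))
    q = [a / nw for a in w]
    m11 = dot(apply_diag(d, p), p)
    m22 = dot(apply_diag(d, q), q)
    m12 = dot(apply_diag(d, p), q)
    mean = 0.5 * (m11 + m22)
    rad = math.hypot(0.5 * (m11 - m22), m12)
    return mean - rad


def sample_minima(d, count, seed=0):
    """Minima of the Rayleigh quotient over `count` random planes."""
    rng = random.Random(seed)
    out = []
    for _ in range(count):
        u = [rng.gauss(0, 1) for _ in d]
        v = [rng.gauss(0, 1) for _ in d]
        out.append(min_rayleigh_on_plane(d, u, v))
    return out


if __name__ == "__main__":
    d = [3.0, 2.0, 1.0]
    # The span of the first two eigenvectors attains the maximum in (7).
    print(min_rayleigh_on_plane(d, [1, 0, 0], [0, 1, 0]))
    # Every other plane gives a lower bound for the second eigenvalue.
    print(max(sample_minima(d, 200)))
```

Each sampled plane produces a number that does not exceed the second eigenvalue, which is the sense in which (7) gives lower bounds.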


Theorem 5. Let A and B denote two compact symmetric operators, A ≤ B. Denote
their positive eigenvalues, indexed in decreasing order, by $\alpha_k$ and $\beta_k$, $k = 1, 2, \ldots$,
respectively. We claim that the kth eigenvalue of A is less than or equal to the
corresponding eigenvalue of B:

$$\alpha_k \le \beta_k. \tag{11}$$


Proof. It follows from the definition of inequality for symmetric operators that
A ≤ B means that (Ax, x) ≤ (Bx, x) for all x. Then also $R_A(x) \le R_B(x)$ for
all x; the conclusion now follows from either Fischer's principle (7) or Courant's
principle (8). □


For negative eigenvalues the opposite inequality holds. 
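A small numerical check of theorem 5 in two dimensions may help fix ideas. The matrices below are assumptions chosen for illustration: B is obtained from A by adding a nonnegative rank-one matrix, so A ≤ B, and the eigenvalues of a real symmetric 2x2 matrix are computed in closed form.

```python
import math


def eig_sym_2x2(m):
    """Eigenvalues (largest first) of a real symmetric matrix [[a, b], [b, d]]."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mean = 0.5 * (a + d)
    rad = math.hypot(0.5 * (a - d), b)
    return mean + rad, mean - rad


A = [[2.0, 1.0], [1.0, 1.0]]
# P is nonnegative: (Px, x) = (x1 + x2)^2 >= 0 for every x.
P = [[1.0, 1.0], [1.0, 1.0]]
B = [[A[i][j] + P[i][j] for j in range(2)] for i in range(2)]

alpha = eig_sym_2x2(A)  # eigenvalues of A, decreasing
beta = eig_sym_2x2(B)   # eigenvalues of B, decreasing

if __name__ == "__main__":
    for k, (a_k, b_k) in enumerate(zip(alpha, beta), start=1):
        print(k, a_k, b_k, a_k <= b_k)
```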


Exercise 6. Show that a compact positive symmetric operator has no negative
eigenvalues.


In chapter 17 we have developed a functional calculus for elements of a Banach
algebra, in particular, for bounded operators mapping a Banach space into itself. The
class of functions f for which we were able to define f(A) consisted of the set of
functions analytic on an open set containing the spectrum of A. We show now that,
thanks to the spectral theory developed in this chapter, we can, for A symmetric and
compact, define f(A) for every f defined on the spectrum of A.


Theorem 6. Let A be a compact symmetric operator. We can assign to every
bounded complex-valued function $f(\sigma)$ defined on the spectrum of A an operator
that we denote as f(A), so that

(i) the operator assigned to the function $f(\sigma) \equiv 1$ is the identity I.
(ii) the operator assigned to the identity function $f(\sigma) = \sigma$ is A.
(iii) the assignment $f \to f(A)$ is an isomorphism of the ring of bounded functions
on $\sigma(A)$ into the algebra of bounded maps of H into H.
(iv) this isomorphism is isometric:

$$\|f(A)\| = \sup_{\sigma \in \sigma(A)} |f(\sigma)|.$$

(v) when f is real valued, f(A) is symmetric.
(vi) when f is positive on the spectrum of A, so is f(A).


Proof. The proof is shorter than the statement of the theorem. Denote by $\{z_n\}$ an
orthonormal basis consisting of eigenvectors of A, with eigenvalues $\alpha_n$. Express x in
H in terms of this basis:

$$x = \sum c_n z_n, \tag{12}$$

and define f(A) to act as follows:

$$f(A)x = \sum f(\alpha_n)\, c_n z_n. \tag{13}$$

Properties (i) to (vi) are now obvious. □
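Formula (13) can be tried out on a 2x2 example. The sketch below assumes the symmetric matrix A = [[2, 1], [1, 2]], whose eigenvalues 3 and 1 and orthonormal eigenvectors are known in closed form; the names `EIGENPAIRS` and `apply_f_of_A` are hypothetical helpers introduced here, not notation from the text.

```python
import math

# Eigenpairs (alpha_n, z_n) of A = [[2, 1], [1, 2]]; the z_n are orthonormal.
R = 1.0 / math.sqrt(2.0)
EIGENPAIRS = [
    (3.0, (R, R)),
    (1.0, (R, -R)),
]


def apply_f_of_A(f, x):
    """Compute f(A)x = sum_n f(alpha_n) c_n z_n with c_n = (x, z_n), as in (13)."""
    out = [0.0, 0.0]
    for alpha, z in EIGENPAIRS:
        c = x[0] * z[0] + x[1] * z[1]  # coefficient c_n of x in (12)
        for i in range(2):
            out[i] += f(alpha) * c * z[i]
    return out


if __name__ == "__main__":
    x = [1.0, 0.0]
    print(apply_f_of_A(lambda s: 1.0, x))   # property (i): the identity
    print(apply_f_of_A(lambda s: s, x))     # property (ii): A itself
    y = apply_f_of_A(math.sqrt, x)          # square root of A applied to x
    print(apply_f_of_A(math.sqrt, y))       # applying it twice gives Ax
```

Applying the function `math.sqrt` twice reproduces Ax, which is the multiplicative property in (iii) for this particular f.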


Corollary 1. Suppose that the operator A is positive; then the spectrum of A lies
on the nonnegative reals. Therefore $f(\sigma) = \sqrt{\sigma}$ is real and can be chosen positive
on the spectrum of A; $\sqrt{A}$, called the positive square root of A, is symmetric and
positive.




Exercise 7. Prove that the positive square root of A is unique, i.e. that there is no 
other positive operator whose square is A. 


Here is a no-cost extension of the spectral theorem for compact symmetric oper- 
ators: 


Theorem 7. Let $\{A_\gamma\}$ be a collection of symmetric operators mapping a Hilbert
space H into itself that commute pairwise: $A_\gamma A_\delta = A_\delta A_\gamma$. Suppose that at least one
of the $A_\gamma$ is compact; then there exists an orthonormal basis $\{z_n\}$ consisting of common
eigenvectors of all the $A_\gamma$:

$$A_\gamma z_n = \alpha_n(\gamma)\, z_n. \tag{14}$$


Proof. Denote by A one of the compact operators in the collection, by $\alpha_n$ its
eigenvalues. Denote by $S_n$ the eigenspace corresponding to $\alpha_n$, that is, the space
of vectors z satisfying

$$Az = \alpha_n z. \tag{14'}$$

It follows from theorem 3 that each $S_n$ is finite dimensional, that they are orthogonal,
and that they span H:

$$H = S_1 \oplus S_2 \oplus \cdots.$$

Each $S_n$ is invariant under all the other $A_\gamma$. To see this, let $A_\gamma$ act on (14'); using
commutativity, we get

$$A_\gamma A z = A A_\gamma z = \alpha_n A_\gamma z.$$

Restricted to $S_n$, $A_\gamma$ is a symmetric operator; therefore $S_n$ is the orthogonal sum
of eigenspaces of $A_\gamma$. We take now another operator $A_\delta$ of the collection and
decompose each of these common eigenspaces of A and $A_\gamma$ into an orthogonal sum
of eigenspaces of $A_\delta$, and so on. Since $S_n$ is finite dimensional, this process must
come to an end after a finite number of steps with a decomposition of $S_n$ into a sum
of eigenspaces for all operators in the collection. Then we turn to $S_{n+1}$ and repeat
the process, and so on. □


Theorem 7 has the following important application to normal operators: 


Definition. An operator N mapping a Hilbert space H into itself is called normal if
N and its adjoint commute:

$$N^*N = NN^*.$$

Exercise 8. Show, without recourse to theorem 3, that eigenvectors of a symmetric
operator belonging to distinct eigenvalues are orthogonal.




Corollary 2. Every compact normal operator has a complete set of orthonormal
eigenvectors.

Proof. Decompose N into the sum of its symmetric and antisymmetric parts:

$$N = R + J, \qquad \text{where } R = \frac{N + N^*}{2}, \quad J = \frac{N - N^*}{2}.$$

Clearly, R is symmetric, J antisymmetric, and $N^* = R - J$. Since N and N* commute,
so do R and J. According to Schauder's theorem, theorem 7 in chapter 21, the adjoint
N* of the compact operator N is compact; therefore so are J and R.
We appeal now to theorem 7, applied to the symmetric operators R and iJ, and
conclude that R and J have a complete orthonormal set of common eigenvectors;
clearly, these are eigenvectors of N as well. □


Definition. An operator U is called unitary if it maps H onto itself isometrically,
that is, $\|Ux\| = \|x\|$.


Exercise 9. Show that a unitary map U satisfies U*U = I. 


Exercise 10. Let U be a unitary operator of form I+ C, C compact. Show that U has 
a complete set of orthonormal eigenvectors, and that all eigenvalues have absolute 
value 1. 


BIBLIOGRAPHY 


Courant, R. Über die Eigenwerte bei den Differenzialgleichungen der Mathematischen Physik. Math.
Zeitschr., 7 (1920): 1-57.


Fischer, E. Über quadratische Formen mit reellen Koefficienten. Monatshefte Math. Phys., 16 (1905):
234-249.


Hilbert, D. Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen. Nachr. Akad. Wiss.
Göttingen. Math.-Phys. Kl. (1906): 157-227.


29 


EXAMPLES OF COMPACT 
SYMMETRIC OPERATORS 


The spectral theory developed in the previous chapter is one of the workhorses of 
analysis. In this chapter we present a few examples. 

29.1 CONVOLUTION 

We take the unit circle $S^1$ and form the Hilbert space $H = L^2(S^1)$, namely the square
integrable functions. Let a denote any complex-valued function of class $L^1(S^1)$, and
define the operator $A : H \to H$ as convolution with a:

$$(Au)(x) = \int_{S^1} a(y)\, u(x - y)\, dy. \tag{1}$$

By a change of variables we can also write

$$(Au)(x) = \int a(x - y)\, u(y)\, dy.$$


Theorem 1. 


(i) Every convolution operator commutes with translation:

$$(T_c u)(x) = u(x + c).$$

(ii) Any two convolution operators commute.


Exercise 1. Prove theorem 1.


Theorem 2. 


(i) A defined in (1) is a bounded operator, and

$$\|A\| \le \|a\|_{L^1}. \tag{2}$$

(ii) A is a compact operator.
(iii) A is a normal operator.
(iv) If a satisfies

$$a(-x) = \overline{a(x)}, \tag{3}$$

then A is symmetric.


Proof. (i) Approximate the integral on the right of (1) by a sum

$$(Au)(x) \approx h \sum_n a(nh)\, u(x - nh).$$

Using the triangle inequality, we get that the right side is bounded in norm by

$$h \sum_n |a(nh)|\, \|u\|.$$

From this inequality (2) follows for smooth a and u by letting h tend to zero. For u
in $L^2$ and a in $L^1$ we use approximation by smooth functions.

(ii) Suppose that a is in $L^2(S^1)$; then we can use the Schwarz inequality to estimate
the modulus of continuity of Au:

$$|(Au)(x) - (Au)(z)| = \Bigl| \int [a(x - y) - a(z - y)]\, u(y)\, dy \Bigr|
\le \Bigl[ \int |a(x - y) - a(z - y)|^2\, dy \Bigr]^{1/2} \|u\|
= \Bigl[ \int |a(y) - a(y + z - x)|^2\, dy \Bigr]^{1/2} \|u\|.$$

As z − x tends to zero, the integral on the right tends to zero; this proves that the
image of the $L^2$ unit ball under A forms an equicontinuous, uniformly bounded
set of functions. According to Arzela-Ascoli such a set of functions is precompact
in the maximum norm. Therefore this set is perforce precompact in the coarser
$L^2$-norm.

Any $L^1$ function a can be approximated in the $L^1$-norm by a sequence of $L^2$
functions $a_n$. It follows from (2) that the corresponding operators $A_n$ approximate A
in the operator norm. Since the uniform limit of compact operators is compact, (ii)
follows.

(iii) Let u and v be any pair of $L^2$ functions. Multiply (1) by $\overline{v(x)}$ and integrate
over $S^1$; we get

$$(Au, v) = \iint a(x - y)\, u(y)\, \overline{v(x)}\, dy\, dx. \tag{4}$$

Interchange the names of the integration variables in the integral on the right
of (4):

$$\iint a(y - x)\, \overline{v(y)}\, u(x)\, dy\, dx = (u, A^*v),$$

where A* is the operation of convolving with the function a* defined as

$$a^*(x) = \overline{a(-x)}. \tag{5}$$

Therefore the adjoint A* of A too is a convolution operator. Since convolutions
commute, A is a normal operator.
(iv) If a satisfies (3), then a* = a and so A* = A. □


According to theorem 7 of chapter 28, a collection of normal operators that commute
pairwise and one of which is compact has an orthonormal basis consisting of
common eigenvectors. Take the collection consisting of all convolutions A and all
translations T; we showed above that the hypotheses of theorem 7, chapter 28 are
fulfilled; therefore there exists an orthonormal basis $\{e_k\}$ of $L^2(S^1)$ such that each
$e = e_k$ satisfies

$$a * e = \alpha e, \qquad e(x + c) = \tau(c)\, e(x). \tag{6}$$


We have seen before that e(x + c) depends continuously on c in the $L^2$-norm. It
follows that the eigenvalue $\tau(c)$ defined in (6) is a continuous function of c. Clearly,
$\tau(c) \ne 0$, for otherwise e(x + c) would be zero for all x, impossible for an
eigenfunction.

Interchange the role of x and c in (6):

$$e(x + c) = \tau(c)\, e(x) = \tau(x)\, e(c).$$

Dividing by $\tau(c)\,\tau(x)$, we get

$$\frac{e(x)}{\tau(x)} = \frac{e(c)}{\tau(c)}.$$

This shows that $e = \text{const} \cdot \tau$; renorming e, we can make that constant = 1. So
equation (6) can be rewritten as

$$e(x + c) = e(c)\, e(x). \tag{7}$$

The function e, being equal to the function $\tau$, is continuous. It is well known that the
only continuous solutions of the functional equation (7) are exponentials.
Since e is continuous on $S^1$, it is periodic, so

$$e_k(x) = e^{ikx}, \qquad k \text{ integer}. \tag{8}$$

This completes the proof that the exponentials (8) form a complete orthogonal system
on $S^1$.

There are of course simpler ways of proving that the exponentials (8) are complete,
but the proof presented above has the virtue that it can be generalized from $S^1$
to other compact commutative groups. Thus in precisely this fashion Hermann Weyl
succeeded in proving that the exponentials $e^{i\xi x}$, $\xi$ real, are complete in the space of
almost periodic functions. For details, see the F. Riesz and Sz.-Nagy text, Functional
Analysis.
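The role of the exponentials as common eigenfunctions of all convolutions has an exact discrete analogue. The sketch below replaces the circle by the cyclic group of n points (an assumption made so that everything is a finite sum) and checks that the sampled exponentials are eigenvectors of cyclic convolution with an arbitrary kernel a; the function names are hypothetical.

```python
import cmath


def cyclic_convolution(a, u):
    """Discrete analogue of (1): (Au)[p] = sum_m a[m] * u[(p - m) mod n]."""
    n = len(a)
    return [sum(a[m] * u[(p - m) % n] for m in range(n)) for p in range(n)]


def exponential(k, n):
    """Samples of e_k(x) = exp(i k x) at the points x = 2*pi*p/n."""
    return [cmath.exp(2j * cmath.pi * k * p / n) for p in range(n)]


def eigenvalue(a, k):
    """Eigenvalue of convolution with a on e_k: sum_m a[m] exp(-2*pi*i*k*m/n)."""
    n = len(a)
    return sum(a[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))


def residual(a, k):
    """Largest entry of |a * e_k - eigenvalue * e_k|; zero up to rounding."""
    e = exponential(k, len(a))
    lam = eigenvalue(a, k)
    conv = cyclic_convolution(a, e)
    return max(abs(c - lam * x) for c, x in zip(conv, e))


if __name__ == "__main__":
    kernel = [0.5, 1.0, -0.25, 2.0, 0.0, 1.5]
    for k in range(len(kernel)):
        print(k, eigenvalue(kernel, k), residual(kernel, k))
```

The eigenvalue attached to e_k is the kth discrete Fourier coefficient of the kernel, mirroring the relation a * e = αe in (6).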


29.2 THE INVERSE OF A DIFFERENTIAL OPERATOR


Denote by L the second-order differential operator

$$L = -\frac{d^2}{dx^2} + q, \tag{9}$$

acting on functions on the interval $[0, 2\pi]$ and vanishing at the endpoints. Here q is
a continuous real valued function, say bounded from below by 1:

$$q(x) \ge 1. \tag{10}$$

It is not hard to show (e.g., see chapter 7) that the boundary value problem

$$Lu = f, \qquad u(0) = 0, \quad u(2\pi) = 0, \tag{11}$$

has a unique solution u for any given continuous function f. We denote the
dependence of this solution u on f by A:

$$u = Af; \tag{12}$$

in words, A is the inverse of L. We will show below that A as defined above is
bounded in the $L^2(0, 2\pi)$ norm, and thus can be extended by continuity to the whole
Hilbert space $H = L^2$.


Theorem 3. The operator A as defined above is 


(i) bounded. 
(ii) compact. 
(iii) symmetric. 
(iv) positive with respect to the $L^2$ scalar product.


Proof. For u twice continuously differentiable, multiply equation (11) by u and
integrate over $[0, 2\pi]$. Using definition (9) of L, integration by parts gives

$$\int (u_x^2 + q u^2)\, dx = \int u f\, dx. \tag{13}$$

We estimate the right side by Schwarz' inequality, and use the arithmetic-geometric
mean inequality and restriction (10) on the left; we get

$$\int u_x^2\, dx + \frac{1}{2} \int u^2\, dx \le \frac{1}{2} \int f^2\, dx.$$

The boundedness of A follows from this. It also follows that A maps the unit ball in
the $L^2$ norm into a set of functions u for which $\|u_x\|^2 \le 1$. According to Rellich's
criterion (see chapter 22) this set is precompact; this proves that A is a compact
operator.
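The passage from identity (13) to the estimate for $u_x$ and $u$ can be spelled out step by step. The chain below is a sketch of the intermediate inequalities; it uses only restriction (10), the Schwarz inequality, and the elementary inequality $ab \le \tfrac12 a^2 + \tfrac12 b^2$, with $\|\cdot\|$ denoting the $L^2(0, 2\pi)$ norm.

```latex
\begin{align*}
\int_0^{2\pi} \bigl(u_x^2 + u^2\bigr)\,dx
  &\le \int_0^{2\pi} \bigl(u_x^2 + q\,u^2\bigr)\,dx
     && \text{since } q \ge 1 \text{ by (10)} \\
  &= \int_0^{2\pi} u f\,dx
     && \text{by (13)} \\
  &\le \|u\|\,\|f\|
     && \text{Schwarz inequality} \\
  &\le \tfrac12 \|u\|^2 + \tfrac12 \|f\|^2 .
\end{align*}
```

Subtracting $\tfrac12\|u\|^2$ from both sides gives $\|u_x\|^2 + \tfrac12\|u\|^2 \le \tfrac12\|f\|^2$. In particular $\|u\| \le \|f\|$, which is the boundedness of A, and $\|u_x\|^2 \le \tfrac12$ whenever $\|f\| \le 1$.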

(iii) The symmetry of A follows from that of L: for u, v twice differentiable, we 
get by integration by parts that 


(Lu, v) = (u, Lv). 
Setting Lu = f, Lv = g, we get

$$(f, Ag) = (Af, g).$$

(iv) Since the right side of (13) is (Af, f), positivity of A follows.
Since A as defined for bounded f is a bounded operator in the $L^2$-norm, it can be
extended by continuity to all f in $L^2$. The extended operator retains all properties
listed in theorem 3. □
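To see the operator A at work numerically, one can discretize the boundary value problem (11). The sketch below is an illustration under stated assumptions, not a construction from the text: it uses second-order centered differences on a uniform grid and solves the resulting tridiagonal system by the standard elimination (Thomas) algorithm. With q ≡ 1 and f(x) = sin x the exact solution of (11) is u(x) = sin(x)/2, which gives a convenient check.

```python
import math


def solve_dirichlet(q, f, n):
    """Approximate u = Af for -u'' + q u = f on [0, 2*pi], u(0) = u(2*pi) = 0.

    Uses n interior grid points and centered second differences; returns
    the grid points and the approximate values of u there.
    """
    h = 2.0 * math.pi / (n + 1)
    xs = [h * (i + 1) for i in range(n)]
    diag = [2.0 / h**2 + q(x) for x in xs]  # main diagonal
    off = -1.0 / h**2                       # constant sub- and superdiagonal
    rhs = [f(x) for x in xs]
    # Forward elimination of the subdiagonal.
    for i in range(1, n):
        w = off / diag[i - 1]
        diag[i] -= w * off
        rhs[i] -= w * rhs[i - 1]
    # Back substitution.
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] - off * u[i + 1]) / diag[i]
    return xs, u


def quadratic_form(xs, u, f):
    """Riemann-sum approximation of (Af, f) = integral of u*f."""
    h = xs[1] - xs[0]
    return h * sum(ui * f(x) for x, ui in zip(xs, u))


if __name__ == "__main__":
    xs, u = solve_dirichlet(lambda x: 1.0, math.sin, 199)
    err = max(abs(ui - 0.5 * math.sin(x)) for x, ui in zip(xs, u))
    print("max error:", err)
    print("(Af, f) approx:", quadratic_form(xs, u, math.sin))
```

The approximate value of (Af, f) is positive, in line with part (iv) of theorem 3.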


Exercise 2. Show that for all f in $L^2$, Af is continuous and is zero at 0 and $2\pi$.
(Hint: Use the estimate $\|u_x\| \le \|f\|$.)


We can now apply the main result of chapter 28 to $H = L^2(0, 2\pi)$ and conclude
that the operator A has a complete set of orthonormal eigenfunctions $\{e_n\}$ in $L^2$:

$$A e_n = \alpha_n e_n. \tag{14}$$

Since A is positive, so are its eigenvalues.
Using the result of exercise 2, we conclude from (14) that each $e_n$ is continuous.
For such $e_n$, L is the inverse of A, so we apply L to (14) to conclude that

$$L e_n = \lambda_n e_n, \qquad \lambda_n = \alpha_n^{-1}. \tag{14'}$$

The eigenvalues $\alpha_n$ of A tend to zero; it follows from (14') that the eigenvalues
$\lambda_n$ of L tend to infinity.


29.3 THE INVERSE OF PARTIAL DIFFERENTIAL OPERATORS 


The analysis in the last section can be applied, with a few changes, to partial
differential operators. The simplest case is

$$L = -\Delta, \tag{15}$$

the Laplace operator $\Delta = \sum \partial_j^2$, $\partial_j = \partial/\partial x_j$. Let G be a domain in $\mathbb{R}^n$, and consider
the boundary value problem

$$Lu = f \text{ in } G, \qquad u = 0 \text{ on the boundary of } G. \tag{16}$$

It is a basic result of the theory of partial differential equations (see chapter 7) that for
f sufficiently differentiable and G smoothly bounded, this boundary value problem
has a unique solution u. As before, we denote this solution u as Af. The analogue
of theorem 3 holds in this case too and is proved by the same technique, namely
by integration by parts. This allows the extension of A by continuity to all of
$H = L^2(G)$, and the application of the spectral theorem of chapter 28 to conclude that A
has a complete orthonormal set of eigenfunctions $e_n$, with positive eigenvalues.


The argument that was used to prove that A is a compact operator shows that A
is a smoothing operator. A not too technical argument shows that for a large enough
k, $A^k$ is highly smoothing, that is, it turns any $L^2(G)$ function into one that is as
often differentiable as we wish. Applying the operator A many times to the eigenvalue
equation $Ae = \alpha e$ gives $A^k e = \alpha^k e$. It follows that the eigenfunction e is
sufficiently differentiable; therefore LAe = e. Applying L to the eigenvalue equation
gives therefore

$$L e_n = \lambda_n e_n, \qquad \lambda_n = \alpha_n^{-1},$$

meaning that the $e_n$ are eigenfunctions of the differential operator L as well.


This setup is applicable to more general elliptic operators. For instance, we can 
set 


$$L = -\sum_{i,j} \partial_i\, a_{ij}\, \partial_j + q,$$


where $(a_{ij})$ is a positive definite matrix whose entries are smooth functions of x and
q is a smooth nonnegative function. The boundary condition u = 0 on the boundary
may be replaced by others, for instance, the Neumann condition that the normal
derivative of u vanish on the boundary. But note that whereas for the boundary con- 
dition u = 0 smoothness of the boundary may be relinquished, it cannot be for the 
Neumann condition. 

The Fischer and Courant variational principles for the eigenvalues of A can be
translated into variational principles for the eigenvalues of L. They yield excellent
asymptotic estimates for the size of the nth eigenvalue; see Hermann Weyl, and
Courant quoted in chapter 28.


BIBLIOGRAPHY 


Weyl, H. Über die asymptotische Verteilung der Eigenwerte. Göttinger Nachr. (1911): 110-117.


30


TRACE CLASS AND 
TRACE FORMULA 


A remarkable result of linear algebra is the trace formula, which says that the sum 
of the eigenvalues of a square matrix equals the trace of the matrix defined as the 
sum of its diagonal elements. In 1959 Lidskii showed that this relation is valid also 
for a large class of compact operators in Hilbert space. This result is rather deep, and
its proof correspondingly tricky. Lidskii’s trace formula is a powerful tool in many 
branches of analysis. 


30.1 POLAR DECOMPOSITION AND SINGULAR VALUES 


Let H be a separable Hilbert space over C, and T some compact operator mapping H
into itself. Denote the adjoint of T as T*; the product T*T is clearly a nonnegative,
symmetric operator; according to the functional calculus described in chapter 28,
T*T has a uniquely determined positive square root $A = (T^*T)^{1/2}$. For any u in H,

$$\|Tu\|^2 = (Tu, Tu) = (u, T^*Tu) = (u, A^2u) = (Au, Au) = \|Au\|^2. \tag{1}$$

As we apply (1) to u − v, we deduce that if Au = Av, then Tu = Tv. This enables
us to define the operator U on the range of A as follows:

$$U : Au \mapsto Tu. \tag{2}$$

It further follows from (1) that U is an isometry on the range of A.
Denote the range of A by R, and define U to be zero on the orthogonal complement
of R:

$$Un = 0 \quad \text{for } n \perp R.$$

Since (Un, v) = (n, U*v) = 0 for $n \perp R$ and for all v, it follows that U* maps H
into the orthogonal complement of $R^\perp$; thus the range of U* lies in the closure
of R. We claim that

$$U^*Uw = w \quad \text{for } w \text{ in } R. \tag{3}$$




To see this take any z and w in R. Since U is an isometry on R, it also preserves the 
scalar product of any two elements in R: 


(z, w) = (Uz, Uw) = (z, U*Uw). 


It follows that (z, U*Uw − w) = 0. Since z is an arbitrary element of R,

$$U^*Uw - w \perp R.$$

On the other hand, we have shown that U* maps H into R. So we see that for w in
R, U*Uw − w both belongs to R and is perpendicular to it. Therefore (3) follows.
We summarize the information contained in (2) and (3):


Theorem 1. Every compact operator T can be factored as 
T=UA, (2') 
where A is a positive symmetric operator, and U*U = I on the range of A. 


The operator A is called the absolute value of T, and (2’) is called the polar 
decomposition of T. 

Theorem 1 is true not only for compact operators but for all bounded operators. The
only place in the proof where compactness was used is the construction of the square
root of T*T. As we will show in the next chapter, every bounded positive symmetric
operator has a square root, not just the compact ones.

When T is compact, so is its absolute value A. The nonzero eigenvalues of A,
denoted as $\{s_j\}$, are positive numbers that tend to zero; we index them in decreasing
order. The numbers $s_j$ are called the singular values of the operator T, and denoted
as $s_j(T)$.
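For a concrete picture of the polar decomposition (2'), the following sketch works with real 2x2 matrices, where the adjoint is the transpose. It assumes T is invertible, so that U can be computed as T times the inverse of A, and it uses the closed-form square root of a nonnegative symmetric 2x2 matrix M, namely (M + sqrt(det M) I) / sqrt(tr M + 2 sqrt(det M)), which follows from the Cayley-Hamilton identity. The example matrix and all function names are assumptions made for illustration.

```python
import math


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(x):
    return [[x[j][i] for j in range(2)] for i in range(2)]


def sqrt_psd_2x2(m):
    """Positive square root of a nonzero, nonnegative symmetric 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    s = math.sqrt(max(det, 0.0))
    t = math.sqrt(m[0][0] + m[1][1] + 2.0 * s)
    return [[(m[i][j] + (s if i == j else 0.0)) / t for j in range(2)]
            for i in range(2)]


def inverse(x):
    det = x[0][0] * x[1][1] - x[0][1] * x[1][0]
    return [[x[1][1] / det, -x[0][1] / det],
            [-x[1][0] / det, x[0][0] / det]]


def polar(t):
    """Return (U, A) with T = U A, A = (T^T T)^(1/2); T must be invertible."""
    a = sqrt_psd_2x2(matmul(transpose(t), t))
    u = matmul(t, inverse(a))
    return u, a


if __name__ == "__main__":
    T = [[1.0, 2.0], [0.0, 1.0]]
    U, A = polar(T)
    print("A =", A)
    print("U =", U)
    print("U^T U =", matmul(transpose(U), U))
```

For this T the absolute value A has trace 2·sqrt(2) and determinant 1, so the singular values are sqrt(2) + 1 and sqrt(2) − 1, while both eigenvalues of T equal 1.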


Exercise 1. Show that for each j, $s_j(T)$ is a continuous function of T in the norm
topology.


30.2 TRACE CLASS, TRACE NORM, AND TRACE 
Definition. A compact map T of a Hilbert space H into H is in trace class when

$$\sum_{j=1}^{\infty} s_j(T) < \infty. \tag{4}$$

This sum (4) is called the trace norm of T:

$$\|T\|_{\mathrm{tr}} = \sum_j s_j(T). \tag{4'}$$


Exercise 2. Show that $\|T\| \le \|T\|_{\mathrm{tr}}$.




The next theorem enumerates the basic properties of the trace norm: 


Theorem 2. Let T be a trace class operator, B any bounded operator. Then 


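(i) $\|T\|_{\mathrm{tr}} = \|T^*\|_{\mathrm{tr}}$.
(ii) $\|BT\|_{\mathrm{tr}} \le \|B\|\, \|T\|_{\mathrm{tr}}$.
(iii) $\|TB\|_{\mathrm{tr}} \le \|B\|\, \|T\|_{\mathrm{tr}}$.
(iv) For any pair of trace class operators T and S, T + S is trace class, and

$$\|T + S\|_{\mathrm{tr}} \le \|T\|_{\mathrm{tr}} + \|S\|_{\mathrm{tr}}.$$


In words, the trace class is closed under adjointness, and is a two-sided ideal in
the algebra of all bounded operators. The trace norm satisfies the triangle inequality.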


Proof. (i) We will show that $s_j(T^*) = s_j(T)$. The singular values of T* are the
positive eigenvalues of the square root of $T^{**}T^* = TT^*$. We claim that TT* and
T*T have the same positive eigenvalues. To see this, let z be an eigenvector, $\lambda$ an
eigenvalue of T*T:

$$T^*Tz = \lambda z, \qquad \lambda \ne 0.$$

Let T act on both sides:

$$TT^*Tz = \lambda Tz,$$

which shows that $\lambda$ is an eigenvalue of TT*, with eigenvector Tz; Tz is not zero
because $\lambda \ne 0$. Since $A = (T^*T)^{1/2}$, the eigenvectors of A are those of T*T, and
the eigenvalues the square roots of those of T*T. This proves that $s_j(T) = s_j(T^*)$,
and completes the proof of (i).


Exercise 3. Give an example of a bounded mapping such that T*T has zero for an 
eigenvalue but TT* does not. 


(ii) and (iii): We will show that $s_j(BT) \le \|B\|\, s_j(T)$. To deduce this, we will verify
that the absolute value square of BT is less than $\|B\|^2$ times the absolute value square
of T. Clearly, the associated quadratic forms satisfy the inequality

$$(T^*B^*BTu, u) = \|BTu\|^2 \le \|B\|^2 \|Tu\|^2 = \|B\|^2 (T^*Tu, u);$$

this is the meaning of

$$(BT)^*BT \le \|B\|^2\, T^*T.$$

According to theorem 5 of chapter 28, the jth eigenvalue is a monotonic function,
so

$$s_j^2(BT) \le \|B\|^2 s_j^2(T). \tag{5}$$

Taking the square root and summing over j, we obtain inequality (ii).




Since the singular values of adjoint operators are the same, we deduce from (5)
that

$$s_j(TB) = s_j(B^*T^*) \le \|B^*\|\, s_j(T^*) = \|B\|\, s_j(T). \tag{5'}$$

Summing over j we deduce (iii).
To prove (iv), we establish the following characterization of the trace class and
trace norm:

$$\|T\|_{\mathrm{tr}} = \sup \sum_n |(Tf_n, e_n)|, \tag{6}$$

where the supremum is taken over all pairs of orthonormal bases $\{f_n\}$ and $\{e_n\}$.

We have to show that the right side of (6) never exceeds $\|T\|_{\mathrm{tr}}$, and equals it for
appropriate choice of $f_n$ and $e_n$.
Denote by $z_j$ the normalized eigenvectors of the absolute value A:


$$Az_j = s_j z_j, \qquad \|z_j\| = 1.$$

For any vector f, we can expand,

$$f = \sum_j (f, z_j)\, z_j, \qquad Af = \sum_j s_j (f, z_j)\, z_j.$$

Apply U to both sides; using the polar decomposition T = UA, we get

$$Tf = \sum_j s_j (f, z_j)\, w_j, \tag{7}$$

where $w_j = Uz_j$. According to theorem 1, the $w_j$ form an orthonormal set. We take
the scalar product of (7) with e:

$$(Tf, e) = \sum_j s_j (f, z_j)(w_j, e). \tag{7'}$$

We set now $f = f_n$, $e = e_n$ and sum over n:

$$\sum_n (Tf_n, e_n) = \sum_n \sum_j s_j (f_n, z_j)(w_j, e_n). \tag{8}$$

We claim that the double series on the right converges absolutely and is $\le \|T\|_{\mathrm{tr}}$. To
see this, we sum first with respect to n and apply the Schwarz inequality; we get the
following estimate for the sum on the right in (8):

$$\sum_j s_j \sum_n |(f_n, z_j)(w_j, e_n)| \le \sum_j s_j \Bigl[ \sum_n |(f_n, z_j)|^2 \Bigr]^{1/2} \Bigl[ \sum_n |(w_j, e_n)|^2 \Bigr]^{1/2}.$$

By the Parseval relation

$$\sum_n |(f_n, z_j)|^2 = \|z_j\|^2 = 1, \qquad \sum_n |(w_j, e_n)|^2 = \|w_j\|^2 = 1.$$

This shows that the right side of (8) is bounded by $\sum_j s_j = \|T\|_{\mathrm{tr}}$.




To complete the proof, we choose $f_n = z_n$, $e_n = w_n$, supplemented by an arbitrary
orthonormal basis on the orthogonal complement of the range of A. Setting $f = z_n$,
$e = w_n$ in (7') we get, since U is an isometry on the range of A, that

$$(Tf_n, e_n) = (UAz_n, Uz_n) = s_n.$$

Summing over n, we get equality in (6).

The right side of formula (6) is a supremum of a sum of absolute values of linear
functions of T. Therefore it is a subadditive function of T; it follows that if it is finite
for S and T, it is finite for S + T and satisfies the triangle inequality. □


Exercise 4. Show that the trace class operators form a complete linear space under
the trace norm.


A bounded operator T in Hilbert space can be represented as an infinite matrix
with respect to any orthonormal basis $\{f_n\}$. The (m, n)th element of this matrix
is $(Tf_n, f_m)$. Therefore the trace of this matrix is

$$\sum_n (Tf_n, f_n), \tag{9}$$

provided that this series converges.


Theorem 3. For every trace class operator T the series (9) converges absolutely to 
a limit that is independent of the orthonormal basis chosen. It is called the trace of 
T, and is denoted as tr T. 


Proof. Set in (8) $e_n = f_n$, to obtain

$$\sum_n (Tf_n, f_n) = \sum_n \sum_j s_j (f_n, z_j)(w_j, f_n). \tag{10}$$

As we have already shown, the double series on the right converges, and its value is
$\le \|T\|_{\mathrm{tr}}$.

To show that the trace is independent of choice of the orthonormal basis, we sum
(10) first with respect to n. Using the Parseval relation

$$\sum_n (w, f_n)(f_n, z) = (w, z),$$

we can write (10) as

$$\operatorname{tr} T = \sum_j s_j (w_j, z_j), \tag{11}$$

which is clearly basis independent. □


We state now some of the basic properties of trace. 


Theorem 4. Let T be a trace class operator. 


(i) $|\operatorname{tr} T| \le \|T\|_{\mathrm{tr}}$.
(ii) tr T is a linear function of T.
(iii) $\operatorname{tr} T^* = \overline{\operatorname{tr} T}$.
(iv) For any bounded operator B, tr TB = tr BT.


Proof. Inequality (i) was derived in the course of proving the convergence of (8).
Properties (ii) and (iii) follow from the definition (9) of trace. To prove (iv), we start
with formula (7), and let B act on both sides:

$$BTf = \sum_j s_j (f, z_j)\, Bw_j,$$

so

$$(BTf, f) = \sum_j s_j (f, z_j)(Bw_j, f).$$

Set $f = f_n$ and sum with respect to n. Reversing the order of summation and using
the Parseval relation as in the derivation of (11), we get

$$\operatorname{tr} BT = \sum_j s_j(T)(Bw_j, z_j). \tag{12}$$

On the other hand, replacing f by Bf in (7) gives

$$TBf = \sum_j s_j(T)(Bf, z_j)\, w_j = \sum_j s_j(T)(f, B^*z_j)\, w_j.$$

Proceeding as before, we get

$$\operatorname{tr} TB = \sum_j s_j(T)(w_j, B^*z_j) = \sum_j s_j(T)(Bw_j, z_j),$$

which is the same as formula (12) for tr BT. □
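In finite dimensions the properties of the trace can be checked directly. The sketch below uses real 2x2 matrices chosen for illustration and verifies that the sum (9) does not depend on the orthonormal basis (here a rotated basis) and that tr TB = tr BT; the function names are hypothetical.

```python
import math


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(x):
    return sum(x[i][i] for i in range(len(x)))


def trace_in_basis(t, basis):
    """The sum (9): sum over basis vectors f of (T f, f), real scalar product."""
    total = 0.0
    for f in basis:
        tf = [sum(t[i][j] * f[j] for j in range(len(f))) for i in range(len(f))]
        total += sum(a * b for a, b in zip(tf, f))
    return total


def rotated_basis(theta):
    """An orthonormal basis of R^2 obtained by rotating the standard one."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]


if __name__ == "__main__":
    T = [[1.0, 2.0], [3.0, 4.0]]
    B = [[0.0, 1.0], [5.0, -3.0]]
    print(trace(T), trace_in_basis(T, rotated_basis(0.7)))
    print(trace(matmul(T, B)), trace(matmul(B, T)))
```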


30.3 THE TRACE FORMULA 


The deepest, and most important, property of the trace was proved by Lidskii in 
1959: 


Theorem 5. The trace of a trace class operator is the sum of its eigenvalues: 


$$\operatorname{tr} T = \sum_j \lambda_j(T). \tag{13}$$


The identity (13) is called the trace formula. 


Proof. When T is a normal operator of trace class, we can, according to corollary 2
of chapter 28, choose an orthonormal basis $\{f_n\}$ consisting of eigenvectors of T. By (9),

$$\operatorname{tr} T = \sum_n (Tf_n, f_n) = \sum_n \lambda_n,$$

proving (13).


For T not normal, the eigenvectors are in general not orthogonal, and there may be
generalized eigenvectors:

$$Tw_n = \lambda_n w_n \quad \text{or} \quad Tw_n = \lambda_n w_n + w_{n-1}.$$

We can, by the Gram-Schmidt process, orthonormalize them; $f_n$ is a linear
combination of $w_1, \ldots, w_n$ so that

$$Tf_n = \lambda_n f_n + \text{linear combination of } f_1, \ldots, f_{n-1}.$$

Since the $f_n$ have been chosen to be orthonormal,

$$(Tf_n, f_n) = \lambda_n.$$

Summing over all n would yield the trace formula, provided that the $f_n$ form a basis
for the whole Hilbert space. They do if the eigenvectors and generalized eigenvectors
span the whole space, but if they don't, then the $f_n$ have to be supplemented by an
orthonormal basis $h_m$ for the orthogonal complement of the eigenvectors of T. The
expression of the trace of T now reads

$$\operatorname{tr} T = \sum_n (Tf_n, f_n) + \sum_m (Th_m, h_m) = \sum_n \lambda_n + \sum_m (Th_m, h_m). \tag{14}$$

The task is to show that the second sum on the right is zero. For this we need some
lemmas.


Lemma 6. Let T be a compact operator on a Hilbert space H, K the orthogonal
complement of its eigenvectors and generalized eigenvectors.

(i) K is an invariant subspace of T*.
(ii) The spectrum of T* over K consists of the single point $\lambda = 0$.


Proof. (i) Let e be an eigenvector, possibly generalized, of T:

$$Te = \lambda e + f,$$

f another generalized eigenvector, and suppose that u is orthogonal to e and f. We
claim that so is T*u; for

$$(e, T^*u) = (Te, u) = (\lambda e + f, u) = \lambda (e, u) + (f, u) = 0.$$


(ii) According to Schauder's theorem, the adjoint T* of a compact operator is
compact. If $\lambda$ were a nonzero eigenvalue of T* on K, then $\bar\lambda$ would be an eigenvalue
of T in H of finite multiplicity. According to theorem 6 of chapter 21, there is an
integer i such that the nullspace of $(T^* - \lambda)^i$ equals the nullspace of $(T^* - \lambda)^{i+1}$,
but is larger than the nullspace of $(T^* - \lambda)^{i-1}$. Let u in K be a member of the
nullspace of $(T^* - \lambda)^i$ but not of $(T^* - \lambda)^{i-1}$. Then the equation

$$(T^* - \lambda)\, v = u$$

has no solution; for a solution v would belong to the nullspace of $(T^* - \lambda)^{i+1}$ but not
to that of $(T^* - \lambda)^i$. According to the Fredholm alternative, theorem 8 of chapter 21,
there must be an eigenvector w of T, $(T - \bar\lambda)\, w = 0$, that is not orthogonal to u. But
this is a contradiction, for u belongs to K, and so u is orthogonal to all eigenvectors
of T. □


If T is of trace class over H, so is T*. We claim that T* restricted to its invariant
subspace K is of trace class; this follows immediately from relation (6) characterizing
trace class operators.

Back to formula (14); the second sum on the right can be rewritten as

$$\sum_m (Th_m, h_m) = \sum_m (h_m, T^*h_m).$$

Since $h_m$ is an orthonormal basis of K, this sum is the complex conjugate of the
trace of T* over K. Its vanishing can be formulated so:


Lidskii’s Lemma. Let T be a trace class operator that has no eigenvalues, except 
zero; then trT = 0. 


The rest of this section is devoted to proving this proposition. We start with an 
estimate of the eigenvalues of a compact operator in terms of its singular values. 


Lemma 7. Let T be a compact operator, with nonzero eigenvalues $\lambda_1, \lambda_2, \ldots$,
arranged in decreasing order of their absolute value, including multiplicity. Denote as
before the singular values of T as $s_j(T)$, arranged similarly. Then for any N,

$$\prod_{j=1}^{N} |\lambda_j(T)| \le \prod_{j=1}^{N} s_j(T). \tag{15}$$


Proof. Denote by $E_N$ the space spanned by the first N eigenvectors of T, and
denote by $P_N$ orthogonal projection onto $E_N$. Denote by $T_N$ the restriction of T to
the invariant subspace $E_N$. Denote by $A_N$ the absolute value of $T_N$:

$$T_N = U_N A_N. \tag{16}$$

Since the eigenvalues $\lambda_j$ are nonzero, $T_N$ is invertible. Then so is $U_N$, and therefore
$U_N$ is unitary. Taking the determinant of (16) gives

$$|\det T_N| = \det A_N.$$

Since the determinant of a matrix is the product of its eigenvalues, we can rewrite
this identity as

$$\prod_{j=1}^{N} |\lambda_j| = \prod_{j=1}^{N} \lambda_j(A_N). \tag{17}$$

The operator $TP_N$ acts on $E_N$ as the matrix $T_N$; on the orthogonal complement
of $E_N$, $TP_N = 0$. It follows that the absolute value of $TP_N$ is $A_N$ on $E_N$, zero on
$E_N^\perp$. It follows that $\lambda_j(A_N) = s_j(TP_N)$, $j = 1, \ldots, N$. We appeal now to inequality
(5'): $s_j(TB) \le \|B\|\, s_j(T)$. Apply this to $B = P_N$; we obtain

$$s_j(TP_N) \le s_j(T). \tag{17'}$$

Since we have shown that $\lambda_j(A_N) = s_j(TP_N)$, setting (17') into the right side of
(17) yields (15). □
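Inequality (15) is easy to observe for non-normal 2x2 matrices, where eigenvalues and singular values differ. The sketch below assumes real 2x2 matrices (so the adjoint is the transpose), computes the eigenvalues from the characteristic polynomial and the singular values as square roots of the eigenvalues of the symmetric matrix formed from the transpose of T times T. The two test matrices and the function names are assumptions chosen for illustration.

```python
import cmath
import math


def eigenvalues_2x2(t):
    """Eigenvalues of a real 2x2 matrix, ordered by decreasing absolute value."""
    tr = t[0][0] + t[1][1]
    det = t[0][0] * t[1][1] - t[0][1] * t[1][0]
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    lam = [(tr + disc) / 2.0, (tr - disc) / 2.0]
    return sorted(lam, key=abs, reverse=True)


def singular_values_2x2(t):
    """Singular values of a real 2x2 matrix, in decreasing order."""
    # Entries of the symmetric matrix T^T T.
    a = t[0][0] ** 2 + t[1][0] ** 2
    d = t[0][1] ** 2 + t[1][1] ** 2
    b = t[0][0] * t[0][1] + t[1][0] * t[1][1]
    mean = 0.5 * (a + d)
    rad = math.hypot(0.5 * (a - d), b)
    return math.sqrt(mean + rad), math.sqrt(max(mean - rad, 0.0))


def check_lemma7(t, tol=1e-12):
    """Check (15) for N = 1, 2 for the matrix t."""
    l1, l2 = eigenvalues_2x2(t)
    s1, s2 = singular_values_2x2(t)
    return abs(l1) <= s1 + tol and abs(l1 * l2) <= s1 * s2 + tol


if __name__ == "__main__":
    shear = [[1.0, 2.0], [0.0, 1.0]]      # eigenvalues 1, 1
    nilpotent = [[0.0, 1.0], [0.0, 0.0]]  # eigenvalues 0, 0; trace 0
    for name, t in (("shear", shear), ("nilpotent", nilpotent)):
        print(name, eigenvalues_2x2(t), singular_values_2x2(t), check_lemma7(t))
```

For the nilpotent matrix the only eigenvalue is zero and the trace is zero, although its largest singular value is 1; this is the finite-dimensional shadow of Lidskii's lemma.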


We can deduce further inequalities between $|\lambda_j|$ and $s_j$ with the aid of the
following simple principle:


Lemma 8. Let $a_1 \ge a_2 \ge \cdots$ and $b_1 \ge b_2 \ge \cdots$ be two decreasing sequences of
real numbers, satisfying for each N

$$\sum_{j=1}^{N} a_j \le \sum_{j=1}^{N} b_j. \tag{18}$$

Let F be a convex function defined on $\mathbb{R}$ that tends to zero as its argument tends to
$-\infty$. Then

$$\sum_{j=1}^{N} F(a_j) \le \sum_{j=1}^{N} F(b_j) \tag{18'}$$

for every N.


Proof. The set of functions F described above form a convex cone. It was shown
in chapter 14, section 14.2, that the extreme rays of this convex cone are the piecewise
linear ones:

$$F_z(x) = \begin{cases} 0 & \text{for } x \le z, \\ x - z & \text{for } z \le x, \end{cases}$$

z an arbitrary real number. For this choice of F inequality (18') can be reduced to

$$\sum_{j=1}^{P} (a_j - z) \le \sum_{j=1}^{Q} (b_j - z), \tag{19}$$

where

$$a_j \ge z \ \text{for } j \le P, \qquad a_j < z \ \text{for } j > P$$

and

$$b_j \ge z \ \text{for } j \le Q, \qquad b_j < z \ \text{for } j > Q.$$




To verify (19), we observe that the right side can be characterized as

$$\max_M \sum_{j=1}^{M} (b_j - z).$$

For M = P, $\sum_{j=1}^{P} (b_j - z)$ is, according to (18), greater than the left side of (19), so
even more so for the maximum.

Every F in the cone can be represented as a superposition
of the points on the extreme rays. Since both sides of inequality (18') are linear
functions of F, and since (18') holds for every extreme F, it holds for all F. □


We apply lemma 8 to

$$a_j = \log |\lambda_j(T)|, \qquad b_j = \log s_j(T);$$

taking the logarithm of (15) shows that inequalities (18) are satisfied for this choice. Choosing $F(x) = e^x$, we deduce from (18') that

$$\sum_{j=1}^{N} |\lambda_j(T)| \le \sum_{j=1}^{N} s_j(T). \qquad (20)$$

Choosing $F(x) = \log(1 + r e^x)$, $r > 0$, gives

$$\prod_{j=1}^{N} (1 + r|\lambda_j|) \le \prod_{j=1}^{N} (1 + r s_j). \qquad (21)$$


To estimate the trace of $T$, we approximate $T$ by finite-dimensional projections. Let $\{h_n\}$ be an arbitrary orthonormal basis of the Hilbert space; denote by $P_N$ orthogonal projection onto the span of $h_1, \ldots, h_N$. Denote by $T_N$ the projection of $T$ onto the range of $P_N$:

$$T_N = P_N T P_N. \qquad (22)$$

Lemma 9. Suppose that $T$ is a trace class operator that has no nonzero eigenvalue. Define $T_N$ as above. Then

(i) $T_N$ approaches $T$ uniformly: $\lim \|T_N - T\| = 0$.
(ii) $\lim \operatorname{tr} T_N = \operatorname{tr} T$.
(iii) Denote the spectral radius of $T_N$ by $\sigma_N$; $\sigma_N$ tends to zero as $N \to \infty$.


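Proof. (i) is true for any compact operator $T$, and (ii) is the definition of trace.

(iii) By assumption $T - \lambda$ is invertible for every $\lambda \ne 0$. Given any $\delta > 0$, denote by $m(\delta) = m$ the quantity

$$m = \max_{|\lambda| \ge \delta} \|(T - \lambda)^{-1}\|.$$

By (i) we can choose $M(\delta)$ so large that for $N > M(\delta)$,

$$\|T_N - T\| < \frac{1}{m}.$$

For such $N$ and $|\lambda| \ge \delta$, $(T_N - T)(T - \lambda)^{-1}$ has norm $< 1$, so

$$T_N - \lambda = T_N - T + T - \lambda = \left[(T_N - T)(T - \lambda)^{-1} + I\right](T - \lambda)$$

is invertible when $|\lambda| \ge \delta$. Therefore $\sigma_N < \delta$. □

Denote the eigenvalues of $T_N$ as $\lambda_j^{(N)}$, $j = 1, \ldots, N$. Denote by $D_N$ the polynomial

$$D_N(\lambda) = \prod_{j=1}^{N} \left(1 - \lambda \lambda_j^{(N)}\right). \qquad (23)$$

Lemma 10.

$$\lim_{N \to \infty} D_N(\lambda) = e^{-\alpha\lambda}, \qquad \alpha = \operatorname{tr} T,$$

uniformly on every bounded set of complex numbers $\lambda$.

Proof. Take the logarithmic derivative of (23):

$$\frac{D_N'}{D_N} = -\sum_{j=1}^{N} \frac{\lambda_j^{(N)}}{1 - \lambda \lambda_j^{(N)}}, \qquad D_N' = \frac{dD_N}{d\lambda}.$$

Since each $|\lambda_j^{(N)}|$ is $\le \sigma_N$, we can for $|\lambda| < 1/\sigma_N$ expand each term on the right as a geometric series:

$$\frac{D_N'}{D_N} = -\sum_{j=1}^{N} \sum_{k=0}^{\infty} \lambda^{k} \left(\lambda_j^{(N)}\right)^{k+1} = -\sum_{k=1}^{\infty} S_k^{(N)} \lambda^{k-1}, \qquad (24)$$

where

$$S_k^{(N)} = \sum_{j=1}^{N} \left(\lambda_j^{(N)}\right)^{k}.$$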

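For $k > 1$ we estimate $S_k^{(N)}$ crudely, as follows. Since each $|\lambda_j^{(N)}| \le \sigma_N$, $|S_k^{(N)}| \le \sigma_N^{k-1} \sum_j |\lambda_j^{(N)}|$. We apply now inequality (20) to $T_N$ and deduce, using theorem 2, that

$$|S_k^{(N)}| \le \sigma_N^{k-1} \|T_N\|_{tr} \le \sigma_N^{k-1} \|T\|_{tr}. \qquad (25)$$

For $k = 1$ we have

$$S_1^{(N)} = \sum_{j=1}^{N} \lambda_j^{(N)} = \operatorname{tr} T_N. \qquad (25')$$

We rewrite (24) as

$$\frac{D_N'}{D_N} + \operatorname{tr} T = \operatorname{tr} T - \operatorname{tr} T_N - \sum_{k=2}^{\infty} S_k^{(N)} \lambda^{k-1}.$$

Taking absolute values and using (25) and (25'), we get, for $|\lambda| < 1/\sigma_N$, summing the geometric series,

$$\left|\frac{D_N'}{D_N} + \operatorname{tr} T\right| \le |\operatorname{tr} T - \operatorname{tr} T_N| + \frac{|\lambda|\sigma_N}{1 - |\lambda|\sigma_N}\, \|T\|_{tr}.$$

Now let $N \to \infty$. Using parts (ii) and (iii) of lemma 9, we conclude that

$$\lim_{N \to \infty} \left(\frac{D_N'}{D_N} + \operatorname{tr} T\right) = 0$$

uniformly for all $\lambda$ in a compact set. Integrating this relation with respect to $\lambda$ and using $D_N(0) = 1$, we deduce lemma 10. □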


We use now the definition (23) of $D_N$ to estimate $|D_N(\lambda)|$ as follows:

$$|D_N(\lambda)| \le \prod_{j=1}^{N} \left(1 + |\lambda|\,|\lambda_j^{(N)}|\right).$$

Using inequality (21) with $r = |\lambda|$ applied to the operator $T_N$, we see that the right side above is less than

$$\prod_{j=1}^{N} \left(1 + |\lambda|\, s_j(T_N)\right).$$

According to inequality (17'), $s_j(T_N) \le s_j(T)$, so we get the inequality

$$|D_N(\lambda)| \le \prod_{j=1}^{N} \left(1 + |\lambda|\, s_j(T)\right).$$

Letting $N \to \infty$ and using lemma 10, we obtain

$$|e^{-\alpha\lambda}| \le \prod_{j=1}^{\infty} \left(1 + |\lambda|\, s_j(T)\right).$$

Using the inequality $1 + r \le e^{r}$ on all but the first $M$ factors on the right gives

$$|e^{-\alpha\lambda}| \le \prod_{j=1}^{M} (1 + |\lambda| s_j) \exp\Big(|\lambda| \sum_{j=M+1}^{\infty} s_j\Big) = P_M(|\lambda|)\, e^{\varepsilon_M |\lambda|}, \qquad (26)$$

where $P_M$ is a polynomial of degree $M$, and $\varepsilon_M = \sum_{M+1}^{\infty} s_j$.

Now choose the argument of $\lambda$ so that $-\lambda\alpha$ is positive, and let $|\lambda|$ tend to infinity. Since a polynomial grows more slowly than any exponential, we deduce from (26) that $|\alpha| \le \varepsilon_M$. Since $\varepsilon_M$ tends to zero as $M$ tends to infinity, it follows that $\alpha = 0$. Using lemma 10, $\operatorname{tr} T = \alpha = 0$; this completes the proof of Lidskii's lemma, and thereby of the trace formula. □

The proof presented above for the trace formula is due to Gohberg and Krein. 
Lemmas 7 and 8 have been derived by Hermann Weyl. Lidskii’s proof relied on the 
Hadamard factorization theorem for entire functions of exponential type. 

Lidskii defined trace class by forming the linear span with complex coefficients 
of self-adjoint trace class operators. In Dunford-Schwartz trace class operators are 
defined as the product of two Hilbert-Schmidt operators; see section 30.8. The trace 
formula appears in Dunford-Schwartz, but there is no reference to Lidskii. Under
questioning, Jack Schwartz admitted that he discovered and proved the trace formula 
independently. 


30.4 THE DETERMINANT 


In this section we sketch the definition of the determinant of operators of the form $I + T$, $T$ of trace class, and its fundamental properties. A full discussion is given in Gohberg, Goldberg, and Kaashoek.

For degenerate operators $G$, those with finite-dimensional range, the definition is taken from linear algebra. Let $G$ act on a Hilbert space $H$, and let $K$ be a finite-dimensional subspace of $H$ that contains the range of $G$. With respect to any orthonormal basis of $K$, $I + G$ can be expressed as a matrix; the determinant of this matrix is independent of the choice of the orthonormal basis, or of the subspace $K$. It is defined as the determinant of the operator $I + G$. The determinant has the usual properties:

$$\det (I + G)(I + F) = \det (I + G)\, \det (I + F), \qquad (27a)$$

$$\det (I + G) = \prod_j (1 + \lambda_j), \qquad (27b)$$

where $\lambda_j$ are the eigenvalues of $G$ acting on $K$, including multiplicity. For different choices of $K$ we may get a different number of eigenvalues that are zero; clearly that doesn't change the right side of (27b).
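
The independence of $\det(I + G)$ from the choice of the subspace $K$, and formula (27b), can be illustrated with a small matrix experiment. This is a sketch under stated assumptions: the ambient space is $\mathbb{R}^8$, $G$ is a random rank-two matrix, and the helper `det_on_subspace` is a hypothetical name introduced here, not something from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
X = rng.standard_normal((n, 2))
Y = rng.standard_normal((2, n))
G = X @ Y          # rank <= 2; the range of G lies in the span of the columns of X

def det_on_subspace(G, spanning_vectors):
    """det(I + G) computed on K = span(spanning_vectors), assumed to contain range(G)."""
    Q, _ = np.linalg.qr(spanning_vectors)       # orthonormal basis of K
    G_K = Q.T @ G @ Q                           # matrix of G acting on K
    eigs = np.linalg.eigvals(G_K)               # eigenvalues of G on K
    det = np.linalg.det(np.eye(G_K.shape[0]) + G_K)
    return det, np.prod(1 + eigs)               # both sides of (27b)

# K = range of G (dimension 2)
det_small, prod_small = det_on_subspace(G, X)
# a larger K (dimension 5) that still contains the range of G
det_large, prod_large = det_on_subspace(G, np.hstack([X, rng.standard_normal((n, 3))]))
# the whole space
det_full = np.linalg.det(np.eye(n) + G)

print(det_small, det_large, det_full)
```

Enlarging $K$ only adds zero eigenvalues of $G$, each contributing a factor $1$ to the product in (27b), so the three printed values agree up to rounding.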

Every trace class operator $T$ can be approximated in trace norm by degenerate operators; for instance, take the polar decomposition of $T = UA$, and approximate $A$ by $A_N = A P_N$, where $P_N$ is projection onto the space spanned by the first $N$ eigenfunctions of $A$. Clearly, by definition of trace norm, $\|T - U A_N\|_{tr}$ tends to zero as $N$ tends to $\infty$. The following result holds about such approximations:

Lemma. Let $T$ be a trace class operator, and $T_N$ a sequence of degenerate operators tending to $T$ in trace norm. Then $\det (I + T_N)$ tends to a limit that is independent of the choice of the sequence. This limit is defined as $\det (I + T)$.

The main result about determinants as defined above is that the two properties listed in (27) are retained. For proofs consult GGK.

In chapter 24 we presented Fredholm's theory, based on the notion of the determinant of operators of form $I + K$, $K$ a one-dimensional integral operator with continuous kernel. As we will show in section 30.6, not all operators $K$ of this form are of trace class; so the notion of determinant can be extended beyond the one sketched in this section.


30.5 EXAMPLES AND COUNTEREXAMPLES 
OF TRACE CLASS OPERATORS 


In this section we will study one-dimensional integral operators $K$ of the form

$$(Ku)(s) = \int_0^1 K(s, t)\, u(t)\, dt, \qquad (28)$$

acting on the Hilbert space $H = L^2[0, 1]$.

Many (one might say almost all) bounded operators that interest us are integral operators in one or several dimensions with kernels that may have singularities. In this section we will treat mostly kernels that are continuous functions. Recall that in chapter 24 we have shown that such integral operators are compact maps of $C[0, 1]$ into $C[0, 1]$.

Exercise 5. Show that an integral operator with continuous kernel is a compact map of $L^2[0, 1]$ into itself.

The adjoint $K^*$ of the operator (28) is another integral operator whose kernel $K^*$ is the conjugate transpose of $K$:

$$K^*(s, t) = \overline{K(t, s)}.$$

Clearly, $K$ is a symmetric operator iff its kernel is skew symmetric, that is, if $K^* = K$.

For symmetric integral operators the spectral theory developed in chapter 28 is applicable: $K$ has a complete set of orthonormal eigenfunctions $e_j$ and real eigenvalues $\kappa_j$ accumulating at zero:

$$K e_j = \kappa_j e_j. \qquad (29)$$


Since $K$ maps $L^2$ functions into continuous functions, every eigenfunction $e_j$, with $\kappa_j \ne 0$, is a continuous function. When the kernel is real, the eigenfunctions can be chosen to be real valued.

The following remarkable result was proved early in the game, in 1909, by Mercer:

Theorem 11 (Mercer). Let $K$ be a real-valued symmetric, continuous function of $s$ and $t$. Assume in addition that the operator $K$ in (28) is positive in the usual sense:

$$(Ku, u) \ge 0 \quad \text{for all } u \text{ in } H.$$

Then the kernel $K$ can be expanded in a uniformly convergent series

$$K(s, t) = \sum_j \kappa_j e_j(s) e_j(t), \qquad (30)$$

where $\kappa_j$ and $e_j$ are the eigenvalues and normalized eigenfunctions of $K$.


Proof. The key fact is the elementary observation that the kernel of a positive integral operator is nonnegative on the diagonal. To see this, suppose, on the contrary, that for some $r$, $K(r, r)$ were negative; then $K(s, t)$ would be negative for $s$, $t$ close enough to $r$, and then

$$(Ku, u) = \iint K(s, t)\, u(t)\, u(s)\, ds\, dt$$

would surely be negative for all functions $u$ that are nonnegative and whose support lies close enough to $r$.

Define the degenerate kernel $K_N$ as the partial sum of the series on the right in (30):

$$K_N(s, t) = \sum_{j=1}^{N} \kappa_j e_j(s) e_j(t),$$

and denote by $K_N$ the integral operator with kernel $K_N$. Clearly, the difference $K - K_N$ is a positive operator, for its eigenvectors are $e_j$, and its eigenvalues are $\kappa_j$, $j > N$, and zero. Therefore its kernel $K - K_N$ is nonnegative on the diagonal:

$$0 \le K(s, s) - \sum_{j=1}^{N} \kappa_j e_j^2(s). \qquad (31)$$

This proves that the partial sums of the infinite series

$$\sum_j \kappa_j e_j^2(s) \qquad (30')$$

are uniformly bounded by $K(s, s)$. Since each term is nonnegative, it follows that the series (30') converges for each $s$. Since the partial sums form an increasing sequence of functions, by Dini's theorem convergence is uniform for all $s$ in $[0, 1]$. Using (31), we can, by the Schwarz inequality, estimate the remainder of the series on the right in (30) and prove its convergence uniformly for all $s$ and $t$.

Call this limit $K_\infty$; we claim that $K_\infty = K$. To see why, denote by $K_\infty$ the integral operator whose kernel is $K_\infty$. From the definition of $K_\infty$ as the right side of (30), we see that $e_j$ is an eigenfunction of $K_\infty$, with eigenvalue $\kappa_j$. Thus $K$ and $K_\infty$ act the same way on all $e_j$, and therefore on all their linear combinations. Since both $K$ and $K_\infty$ map functions orthogonal to all $e_j$ into zero, it follows that $Ku = K_\infty u$ for all functions $u$. But then $K$ and $K_\infty$ have the same kernel. □

Exercise 6. Show that an integral operator whose kernel is continuous and $\not\equiv 0$ is $\ne 0$.

Set in equation (30) $s = t$ and integrate; we get

$$\int_0^1 K(s, s)\, ds = \sum_j \kappa_j. \qquad (32)$$

Since the eigenvalues of a symmetric positive operator are its singular values, we conclude

Corollary 11A. An integral operator that satisfies the hypotheses of Mercer's theorem is of trace class.


Corollary 11B. The trace of an integral operator that satisfies the hypotheses of 
Mercer’s theorem equals the integral of its kernel along the diagonal. 
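
As an illustration of formula (32), consider the kernel $K(s, t) = \min(s, t)$, which is continuous, symmetric, and positive. Its eigenvalues are $\kappa_j = 1/((j - \tfrac12)^2 \pi^2)$, a standard fact that is not stated in the text, and $\int_0^1 K(s, s)\, ds = \int_0^1 s\, ds = \tfrac12$. The sketch below compares these with a midpoint-rule discretization of the operator; the discretization, the grid size, and the function name `nystrom_eigenvalues` are assumptions made for the example.

```python
import numpy as np

def nystrom_eigenvalues(kernel, n):
    """Eigenvalues of the midpoint-rule discretization of the operator (28)."""
    h = 1.0 / n
    s = (np.arange(n) + 0.5) * h                 # midpoints of n equal cells
    A = kernel(s[:, None], s[None, :]) * h       # A[i, j] = K(s_i, s_j) * h
    return np.linalg.eigvalsh(A)[::-1], s, h     # eigenvalues in decreasing order

kernel = np.minimum                              # K(s, t) = min(s, t)
eigs, s, h = nystrom_eigenvalues(kernel, 400)

# integral of the kernel along the diagonal (midpoint rule)
diag_integral = np.sum(kernel(s, s)) * h
# first few exact eigenvalues, and a long partial sum of all of them
j = np.arange(1, 6)
exact = 1.0 / ((j - 0.5) ** 2 * np.pi ** 2)
J = np.arange(1, 10001)
exact_partial_sum = np.sum(1.0 / ((J - 0.5) ** 2 * np.pi ** 2))

print(diag_integral, exact_partial_sum)
print(eigs[:5])
print(exact)
```

Both printed sums are close to $\tfrac12$, and the leading discrete eigenvalues track the exact ones, as (32) predicts.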


Formula (32) holds much more generally:

Theorem 12. Let $K$ be an integral operator of form (28), of trace class, with a continuous kernel. Then the trace of $K$ equals the integral of its kernel along the diagonal.

Proof. We take first the case that the kernel is not merely a continuous function but a smooth one. Then the kernel can be expanded as a uniformly convergent series of, say, Legendre polynomials $f_n$:

$$K(s, t) = \sum_{j,m} \kappa_{j,m} f_j(s) f_m(t),$$

where the coefficients $\kappa_{j,m}$ are given by the usual formula for orthonormal expansions:

$$\kappa_{j,m} = \iint K(s, t)\, f_j(s)\, f_m(t)\, ds\, dt.$$

We use now definition (9) of trace with the preceding orthonormal basis $f_n$:

$$(K f_n, f_n) = \int \Big(\int K(s, t) f_n(t)\, dt\Big) f_n(s)\, ds = \kappa_{n,n}$$

according to the formula above for the coefficients $\kappa_{j,m}$. Summing with respect to $n$ gives

$$\operatorname{tr} K = \sum_n \kappa_{n,n}.$$

On the other hand, setting $s = t$ in the series for $K$ gives

$$K(s, s) = \sum_{j,m} \kappa_{j,m} f_j(s) f_m(s).$$

Integrating with respect to $s$ and using the orthogonality of the $f_j$ gives

$$\int K(s, s)\, ds = \sum_n \kappa_{n,n},$$

identical with the expression derived above for the trace of $K$.

To handle integral operators whose kernel is merely continuous, we approximate them with operators with smooth kernels. We need


Theorem 13. An integral operator with a smooth kernel is in trace class.

Proof. If $K$ has a smooth kernel, so does $K^*$, and so does $K^*K$. We will estimate the $n$th eigenvalue $\lambda_n$ of $K^*K = L$. Since $L$ is symmetric, we can apply Courant's principle described in chapter 28:

$$\lambda_n = \min_{S_{n-1}} \max_{u \perp S_{n-1}} \frac{(Lu, u)}{(u, u)}.$$

It follows that for any given subspace $S_{n-1}$ of dimension $n - 1$,

$$\lambda_n \le \max_{u \perp S_{n-1}} \frac{(Lu, u)}{(u, u)}. \qquad (33)$$

We choose $S_{n-1}$ to consist of all polynomials of degree $< n - 1$. Then for $u \perp S_{n-1}$,

$$(Lu, u) = \iint L(s, t)\, u(s)\, u(t)\, ds\, dt = \iint [L(s, t) - P_n(s, t)]\, u(s)\, u(t)\, ds\, dt,$$

where $P_n$ is any function of form

$$P_n(s, t) = \sum_{j=0}^{n-2} a_j(s)\, t^j + b_j(t)\, s^j.$$

According to results in approximation theory, every smooth function $L(s, t)$ can be well approximated by such functions in the $L^2$-norm:

$$\iint |L - P_n|^2\, ds\, dt \le \text{const.}\ n^{-b},$$

where the exponent $b$ is proportional to the number of continuous derivatives possessed by $L$. So, by the Schwarz inequality,

$$\Big|\iint (L - P_n)\, u(s)\, u(t)\, ds\, dt\Big|^2 \le \iint (L - P_n)^2\, ds\, dt \iint u^2(s)\, u^2(t)\, ds\, dt \le \text{const.}\ n^{-b}$$

for all $u$ orthogonal to $S_{n-1}$ and of $L^2$ norm equal to 1. It follows from (33) that $\lambda_n \le \text{const.}\ n^{-b/2}$. Since $L = K^*K$, $\lambda_n = s_n^2(K)$, and so we have the estimate

$$s_n(K) \le \text{const.}\ n^{-b/4}.$$

Clearly, for $b > 4$ the series $\sum s_n(K)$ converges. □
To approximate $K(s, t)$ by smooth kernels, we employ mollifiers. Let $\rho(s)$ be a nonnegative $C^\infty$ function of compact support, with $\int \rho\, ds = 1$. We define $\rho_n(s) = n \rho(ns)$, and the mollifying operator $M_n$ as convolution with $\rho_n$:

$$(M_n u)(s) = \int \rho_n(s - r)\, u(r)\, dr.$$

Define $K_n$ as $M_n K M_n$. $K_n$ is an integral operator whose kernel is the convolution

$$K_n(s, t) = \iint \rho_n(s - r)\, K(r, x)\, \rho_n(x - t)\, dr\, dx;$$

$K_n(s, t)$ is a $C^\infty$ function that tends uniformly to $K(s, t)$ as $n \to \infty$. According to theorem 2, the trace class operators form a two-sided ideal in the ring of all bounded operators; since $K$ in theorem 12 is of trace class, so is $K_n = M_n K M_n$. According to what we have already shown,

$$\operatorname{tr} K_n = \int K_n(s, s)\, ds.$$

As $n$ tends to $\infty$, the right side tends to $\int K(s, s)\, ds$. To complete the proof, all we have to show is that the left side tends to $\operatorname{tr} K$. We leave this as

Exercise 7. Show that $\lim \operatorname{tr} K_n = \operatorname{tr} K$. (Hint: Prove it for $K$ degenerate, and then approximate $K$ in trace norm by a sequence of degenerate operators.) □


We encountered earlier, in chapter 22, the operator of integration:

$$(Vu)(s) = \int_0^s u(t)\, dt.$$

We showed there that $V$ is a compact operator mapping $C[0, 1]$ into $C[0, 1]$. It is equally true that $V$ is a compact mapping of $L^2[0, 1]$ into $L^2[0, 1]$.

Exercise 8. Show that $V$ maps the unit ball in $L^2[0, 1]$ into a compact subset of $C[0, 1]$.

Note that $V$ is an integral operator, with the discontinuous kernel

$$V(s, t) = \begin{cases} 1 & \text{for } t < s, \\ 0 & \text{for } t > s. \end{cases}$$

We showed in chapter 22 that $V$ has no eigenfunctions in $C[0, 1]$. Since $V$ maps $L^2[0, 1]$ into $C[0, 1]$, it follows that $V$ has no eigenfunctions in $L^2[0, 1]$ either. We show now that $V$ is not of trace class, by computing its trace with respect to the trigonometric base: $f_n(t) = \cos(2\pi n t)$, $g_n(t) = \sin(2\pi n t)$. By calculus, for $n \ne 0$,

$$(V f_n)(s) = \frac{\sin(2\pi n s)}{2\pi n}, \qquad (V g_n)(s) = \frac{1 - \cos(2\pi n s)}{2\pi n},$$

while $(V f_0)(s) = s$. Again, by calculus,

$$(V f_0, f_0) = \tfrac12, \qquad (V f_n, f_n) = 0 \ \text{for } n \ne 0, \qquad (V g_n, g_n) = 0.$$

So the trace of $V$ with respect to the trigonometric base is $\tfrac12$, contradicting Lidskii's lemma. □


Exercise 9. Calculate the singular values of $V$ and show that $\sum s_j(V)$ diverges.
(Hint: The inverse of $V^*V$ is a differential operator.)
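
The divergence asserted in exercise 9 can be observed numerically, although this is no substitute for the calculation the exercise asks for. The sketch below replaces $V$ by a lower triangular matrix on a uniform grid, which is an assumed, crude discretization chosen for this illustration, and watches the sum of singular values grow as the grid is refined.

```python
import numpy as np

def volterra_singular_values(n):
    """Singular values of an n-by-n discretization of (Vu)(s) = integral of u over [0, s]."""
    h = 1.0 / n
    V = np.tril(np.ones((n, n))) * h      # kernel 1 for t <= s, 0 for t > s, times cell width
    return np.linalg.svd(V, compute_uv=False)

grid_sizes = [100, 200, 400, 800]
nuclear_norms = [volterra_singular_values(n).sum() for n in grid_sizes]
for n, total in zip(grid_sizes, nuclear_norms):
    print(n, total)
```

Each doubling of the grid size increases the sum by a roughly constant amount, which is the behavior of a logarithmically divergent series; for a trace class operator the sums would instead level off.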


We remind the reader that we showed in chapter 24, theorem 5, that an integral operator whose kernel is Hölder continuous, with Hölder exponent $> \tfrac12$, satisfies the integral form of the trace formula.

We close this section by asking, and answering, the following question: Given an integral operator $K$, how can we decide if it has any nonzero eigenvalues? If the operator is known to be of trace class, we can find its trace as the integral of its kernel along the diagonal; if this is nonzero, then by the trace formula, $K$ has a nonzero eigenvalue. If $\operatorname{tr} K = 0$, no conclusion can be drawn. We can then look at $K^2$, an integral operator whose kernel can be computed from that of $K$. It is of trace class, and its trace can be computed by integration. If $\operatorname{tr} K^2 \ne 0$, then $K$ has a nonzero eigenvalue; otherwise, we inspect the trace of $K^3$, and so on. What if this process never ends?

Theorem 14. Let $K$ be an integral operator with a continuous kernel, of trace class. Suppose $\operatorname{tr} K^n = 0$ for all positive integers $n$; then $K$ has no nonzero eigenvalues.


Proof. Denote the nonzero eigenvalues of $K$ by $\kappa_j$. The eigenvalues of $K^n$ are $\kappa_j^n$, so by the trace formula

$$\operatorname{tr} K^n = \sum_j \kappa_j^n. \qquad (35)$$

It follows from inequality (20) that $\sum |\kappa_j| \le \|K\|_{tr}$. We build the entire analytic function

$$F(z) = \sum_j \left(e^{\kappa_j z} - 1\right). \qquad (36)$$

Since $|e^{w} - 1| \le e|w|$ for $|w| \le 1$, the series (36) converges for all $z$. The Taylor coefficients of $F$ at $z = 0$ can be computed by differentiating (36) termwise:

$$F(0) = 0, \qquad F^{(n)}(0) = \sum_j \kappa_j^n.$$

It follows from the assumption that $\operatorname{tr} K^n = 0$, and so by (35) that all the Taylor coefficients of $F$ are zero, and therefore $F(z)$ itself is zero for all $z$. We claim that then all $\kappa_j$ are zero. Suppose not; let $\kappa_1, \ldots, \kappa_j$ be those of largest absolute value. Choose $z$ so that $\kappa_1 z$ is real and positive, and let $|z| \to \infty$. Clearly, the first term in (36) dominates all others, and $F(z) \approx m\, e^{|\kappa_1||z|}$, where $m$ is the multiplicity of $\kappa_1$. This contradicts $F(z) = 0$. □
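
The finite-dimensional counterpart of theorem 14 is easy to observe: a matrix all of whose powers have trace zero is nilpotent, so its only eigenvalue is zero. The following sketch builds such a matrix and checks both properties; the construction (a strictly upper triangular matrix in a rotated basis) is an assumption made for the example. Nilpotency is tested through the matrix power rather than through computed eigenvalues, since eigenvalues of nilpotent matrices are badly conditioned numerically.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
U = np.triu(rng.standard_normal((n, n)), k=1)       # strictly upper triangular
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))    # random orthogonal change of basis
K = Q @ U @ Q.T                                     # no longer triangular in appearance

# traces of K, K^2, ..., K^n
traces = [np.trace(np.linalg.matrix_power(K, p)) for p in range(1, n + 1)]
# K^n = 0 means the characteristic polynomial is z^n: zero is the only eigenvalue
nilpotency_defect = np.linalg.norm(np.linalg.matrix_power(K, n))

print(max(abs(t) for t in traces), nilpotency_defect)
```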


30.6 THE POISSON SUMMATION FORMULA


In this section we study integral operators of convolution form. Let $f$ be any continuous function on the circle $S^1$. $T_f = T$ is defined as convolution with $f$:

$$(Tu)(s) = \int_{S^1} f(s - t)\, u(t)\, dt/2\pi. \qquad (37)$$

As we saw in chapter 29, the eigenfunctions of $T$ are the exponentials $e_n(t) = e^{int}$:

$$T e_n = \int f(s - t)\, e^{int}\, dt/2\pi = \int f(r)\, e^{-inr}\, dr/2\pi\ e^{ins} = a_n e^{ins}. \qquad (38)$$

So the eigenvalues are the Fourier coefficients $a_n$ of $f$.

The kernel of the integral operator $T$ equals $f(0)$ at every point on the diagonal; so if $T$ were a trace class operator, $\operatorname{tr} T = f(0) = \sum a_n$, by the trace formula. This is the same as saying that the Fourier series of $f$ at $s = 0$ converges to $f(0)$, true for sufficiently smooth functions but not for all continuous functions; see chapter 11, section 11.2. This shows that not all integral operators with a continuous kernel are of trace class.

Consider functions $g$ defined on the whole real line, smooth and decreasing rapidly as $|s|$ tends to infinity. Define now the operator $T$ as convolution:

$$(Tu)(s) = \int_{\mathbb{R}} g(s - t)\, u(t)\, dt/2\pi, \qquad (39)$$

regarded as a mapping of $L^2(S^1)$ into $L^2(S^1)$. We can put this in the form (37) by chopping up $\mathbb{R}$ as the union of intervals $[2\pi m, 2\pi(m + 1)]$ of length $2\pi$:

$$(Tu)(s) = \int_{S^1} \sum_m g(s - t + 2\pi m)\, u(t)\, dt/2\pi. \qquad (39')$$

The eigenvalues of $T$ are given by formula (38), which can be rewritten as follows:

$$a_n = \int_{S^1} \sum_m g(r + 2\pi m)\, e^{-inr}\, dr/2\pi = \int_{\mathbb{R}} g(r)\, e^{-inr}\, dr/2\pi = \hat g(n)/2\pi,$$

where $\hat g$ is the Fourier transform of $g$. The kernel of the integral operator $T$ equals $\sum_m g(2\pi m)$ at every point on the diagonal. Therefore the trace formula asserts that

$$\sum_m g(2\pi m) = \frac{1}{2\pi} \sum_n \hat g(n).$$


This is the classical Poisson summation formula. 
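
The formula can be checked numerically for a Gaussian. The sketch below assumes the normalization $\hat g(\xi) = \int g(r) e^{-i\xi r}\, dr$ used in the eigenvalue computation above, takes $g(x) = e^{-x^2/2}$, for which $\hat g(\xi) = \sqrt{2\pi}\, e^{-\xi^2/2}$, and truncates both sums at an arbitrary cutoff `M`; the choice of test function and cutoff are illustrative assumptions.

```python
import math

def g(x):
    """Test function g(x) = exp(-x^2 / 2)."""
    return math.exp(-x * x / 2)

def g_hat(xi):
    """Fourier transform of g with the convention: integral of g(r) exp(-i xi r) dr."""
    return math.sqrt(2 * math.pi) * math.exp(-xi * xi / 2)

M = 50  # truncation of both sums; the neglected terms are far below rounding error
lhs = sum(g(2 * math.pi * m) for m in range(-M, M + 1))
rhs = sum(g_hat(n) for n in range(-M, M + 1)) / (2 * math.pi)

print(lhs, rhs)
```

The left side is dominated by its single term $m = 0$, while the right side needs several terms to reach the same value, which is typical of how the formula trades a slowly convergent sum for a rapidly convergent one.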

The scope of this argument can be enlarged beyond this simple case. The additive 
group of the reals can be replaced by other, not necessarily commutative, groups, 
and the group of integer multiples of 2x by other discrete subgroups. The celebrated 
Selberg trace formula is a far-reaching generalization of the Poisson summation formula.


30.7 HOW TO EXPRESS THE INDEX OF AN OPERATOR AS A 
DIFFERENCE OF TRACES 


We recall from chapter 27 the notion of the index of a bounded mapping $F$ of a Banach space $U$ into another Banach space $V$. Denote by $N$ the nullspace, by $R$ the range of $F$; assume that the dimension of $N$ and the codimension of $R$ are both finite. Their difference is defined as the index of $F$:

$$\operatorname{ind} F = \dim N - \operatorname{codim} R. \qquad (40)$$

According to theorem 1 of chapter 27, an operator $F : U \to V$ has an index iff it has a pseudoinverse $G : V \to U$, such that

$$GF = I - T, \qquad FG = I - S, \qquad (41)$$

where $T : U \to U$ and $S : V \to V$ are compact maps. In this section we study the case when $U$ and $V$ are Hilbert spaces, and $T$ and $S$ are not merely compact but are of trace class.

Theorem 15. Let $U$ and $V$ be a pair of Hilbert spaces, $F : U \to V$ and $G : V \to U$ bounded operators that are pseudoinverses of each other in the sense of (41), where $T : U \to U$ and $S : V \to V$ are trace class operators. Then

$$\operatorname{ind} F = \operatorname{tr} T - \operatorname{tr} S. \qquad (42)$$


Proof. Multiply the first relation in (41) by $F$ on the left, the second relation by $F$ on the right, and subtract one from the other:

$$FT = SF. \qquad (43)$$

Decompose orthogonally $U$ and $V$ as follows:

$$U = N \oplus Z, \qquad V = R \oplus W.$$

Define $P$ as the orthogonal projection of $U$ onto $Z$. Since the orthogonal complement of $Z$ is the nullspace $N$ of $F$, it follows that $FP = F$. Setting this into (43) gives

$$FPT = SF. \qquad (44)$$

Note that $PT$ maps $Z \to Z$, $S$ maps $R \to R$, and $F$ is an invertible map of $Z \to R$. We claim that

$$\operatorname{tr} PT/Z = \operatorname{tr} S/R, \qquad (45)$$

where $\operatorname{tr} PT/Z$ means the trace of $PT$ restricted to the invariant subspace $Z$, and so on.

Proof. Choose any invertible map $M$ of $R$ onto $Z$. Multiply (44) on the left by $M$:

$$MFPT = MSF = MSM^{-1}\, MF.$$

Multiply this by $(MF)^{-1}$ on the right:

$$(MF)(PT)(MF)^{-1} = (MSM^{-1}).$$

All operators in parentheses map $Z \to Z$. So by the commutative property of trace, theorem 4 (iv),

$$\operatorname{tr} PT/Z = \operatorname{tr} MSM^{-1}/Z = \operatorname{tr} S/R. \qquad □$$

We express now the trace of $T$ over $U$ in terms of the trace of $PT$ over $Z$. Build an orthonormal basis for $U$ consisting of a basis $\{n_j\}$ in $N$ and a basis $\{z_j\}$ in $Z$. Then

$$\operatorname{tr} T = \sum (T n_j, n_j) + \sum (T z_j, z_j).$$

Since $P z_j = z_j$, we can rewrite the second sum on the right as

$$\sum (T z_j, P z_j) = \sum (PT z_j, z_j) = \operatorname{tr} PT/Z.$$

On the other hand, $F = 0$ on $N$, so it follows from the first relation of (41) that $T = I$ on $N$. Therefore

$$\sum (T n_j, n_j) = \sum (n_j, n_j) = \dim N.$$

Putting together the last three relations gives

$$\operatorname{tr} T = \dim N + \operatorname{tr} PT/Z. \qquad (46)$$

Similarly, we express the trace of $S$ over $V$ in terms of the trace of $S$ over $R$. We build an orthonormal base for $V$ consisting of a basis $\{w_j\}$ for $W$ and a basis $\{r_j\}$ for $R$. Then

$$\operatorname{tr} S = \sum (S w_j, w_j) + \sum (S r_j, r_j).$$

We identify the second sum on the right as

$$\sum (S r_j, r_j) = \operatorname{tr} S/R.$$

It follows from the second relation in (41) that the range of $I - S$ lies in $R$, and therefore is orthogonal to $W$. In particular, $((I - S) w_j, w_j) = 0$, so

$$\sum (S w_j, w_j) = \sum (w_j, w_j) = \dim W = \operatorname{codim} R.$$

Putting together the last three relations gives

$$\operatorname{tr} S = \operatorname{codim} R + \operatorname{tr} S/R. \qquad (46')$$

Subtract (46') from (46); since we have shown in (45) that $\operatorname{tr} PT/Z = \operatorname{tr} S/R$, we obtain the trace formula (42) for the index. □
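
In finite dimensions every operator is of trace class, so formula (42) can be tested directly on matrices, where $\operatorname{ind} F = \dim U - \dim V$. The sketch below is only an illustration under assumed choices: the dimensions, the random matrices, and the use of the Moore-Penrose inverse as one particular pseudoinverse are not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 7, 4                               # dim U = 7, dim V = 4
F = rng.standard_normal((m, n))           # F : U -> V

rank = np.linalg.matrix_rank(F)
index = (n - rank) - (m - rank)           # dim N - codim R, as in (40)

def trace_difference(F, G):
    """tr T - tr S for T = I - GF and S = I - FG, as in (41)."""
    T = np.eye(F.shape[1]) - G @ F
    S = np.eye(F.shape[0]) - F @ G
    return np.trace(T) - np.trace(S)

# a "good" pseudoinverse: T is then the orthogonal projection onto the nullspace of F
diff_pinv = trace_difference(F, np.linalg.pinv(F))
# a crude one: any G works here, since all operators are trace class in finite dimensions
diff_random = trace_difference(F, rng.standard_normal((n, m)))

print(index, diff_pinv, diff_random)
```

Both trace differences equal the index, $7 - 4 = 3$, up to rounding, regardless of which $G$ is used.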


Even when $G$ is too crude a pseudoinverse for $F$ to make $S$ and $T$ of trace class, it could happen that for some positive integer $n$, $S^n$ and $T^n$ are trace class:

Corollary 15'. The spaces $U$, $V$ and the operators $F$, $G$ are as in theorem 15, and $S^n$ and $T^n$ are of trace class, $n$ some positive integer. Then

$$\operatorname{ind} F = \operatorname{tr} T^n - \operatorname{tr} S^n. \qquad (47)$$

Proof. We replace the pseudoinverse $G$ by $G_n = \big(\sum_0^{n-1} T^j\big) G$. Then

$$G_n F = \Big(\sum_{j=0}^{n-1} T^j\Big) GF = \Big(\sum_{j=0}^{n-1} T^j\Big)(I - T) = I - T^n, \qquad (48)$$

where we have used the first relation in (41). Using both relations in (41) we deduce

$$F G_n = F \Big(\sum_{j=0}^{n-1} T^j\Big) G = FG \Big(\sum_{j=0}^{n-1} S^j\Big) = (I - S)\Big(\sum_{j=0}^{n-1} S^j\Big) = I - S^n. \qquad (48')$$

Now we apply theorem 15. □


If formula (47) holds for one value of $n$, it holds for all larger ones. This seems peculiar, until we do

Exercise 10. Show that if $S$ and $T$ are related as in (41), then each eigenvalue $\ne 1$ of $T$ is an eigenvalue of $S$, with the same multiplicity.

Theorem 15 and its corollary can be very useful in calculating the index of operators; see Gilkey.


30.8 THE HILBERT-SCHMIDT CLASS


The last exercise of this chapter summarizes the main properties of the Hilbert-Schmidt (HS) class of operators in Hilbert space.

Exercise 11. A bounded linear operator $K$ mapping a Hilbert space $H$ into itself belongs to the HS class if for some orthonormal basis $\{e_n\}$ of $H$,

$$\sum_n \|K e_n\|^2 < \infty. \qquad (49)$$

(a) Show that if $K$ satisfies (49) for one orthonormal basis, then it satisfies (49) for every orthonormal basis, and the sum in (49) is independent of the basis. The square root of this sum is called the HS-norm, denoted as $\|K\|_{HS}$.
(b) Show that $\|K\| \le \|K\|_{HS}$.
(c) Show that if $K$ is HS, so is its adjoint $K^*$, and $\|K^*\|_{HS} = \|K\|_{HS}$.
(d) Show that the HS operators form a complete normed space in the HS-norm.
(e) Show that if $K$ is HS and $B$ is any bounded operator, then $BK$ and $KB$ are HS, and $\|BK\|_{HS}$, $\|KB\|_{HS}$ are both $\le \|B\|\, \|K\|_{HS}$.
(f) Show that $K$ is HS iff $\sum s_j^2(K) < \infty$.
(g) Show that every HS operator is compact.
(h) Show that every trace class operator is HS.
(i) Show that the product of two HS operators $K$ and $H$ is in trace class, and $\|KH\|_{tr} \le \|K\|_{HS}\, \|H\|_{HS}$.
(j) Show that every trace class operator can be written as a product of two HS operators.


30.9 DETERMINANT AND TRACE FOR OPERATOR 
IN BANACH SPACES 


The earliest developments of a determinant theory for operators in a Banach space are due to Lezanski in 1953, Grothendieck in 1956, and Sikorski in 1961. The earliest derivation of the trace formula for a class of operators in a Banach space is due to König; a systematic approach has been developed by Pietsch in a series of publications culminating in his monograph.

Yet another, still broader, systematic approach is presented in the recent excellent monograph of Gohberg, Goldberg, and Krupnik.


BIBLIOGRAPHY 


Dunford, N. and Schwartz, J. T. Linear Operators: Part II, Spectral Theory, Interscience-Wiley, New York, 1963; see esp. ch. XI, sec. 6.

Gilkey, P. B. Invariance Theory, the Heat Equation and the Atiyah-Singer Index Theorem, 2nd ed. CRC Press, Boca Raton, FL, 1995.

Gohberg, I. C., Goldberg, S., and Kaashoek, M. A. Classes of Linear Operators, Vol. I. Birkhäuser, Boston, 1990.

Gohberg, I. C., Goldberg, S., and Krupnik, N. Traces and determinants of linear operators. Operator Theory Adv. and Appl., 116 (2000).

Gohberg, I. C. and Krein, M. G. Introduction to the theory of linear nonself-adjoint operators. Nauka, Moscow (1965); AMS Trans. Math. Monogr., 18 (1969).

Grothendieck, A. La théorie de Fredholm. Bull. Soc. Math. France, 84 (1956): 319-384.

Johnson, W. B., König, H., Maurey, B., and Retherford, J. R. Eigenvalues of p-summing and l_p-type operators in Banach spaces. J. Funct. Anal., 32 (1979): 353-380.

König, H. s-numbers, eigenvalues and the trace theorem in Banach spaces. Studia Math., 67 (1980): 157-171.

Lax, P. D. The existence of eigenvalues of integral operators. Indiana U. Math. J., 42 (1993): 889-891.

Leiterer, H. and Pietsch, A. An elementary proof of Lidskii's trace theorem. Wiss. Ztsch. Friedrich Schiller Univ. Jena, Math.-Nat. R., 31 (1982): 587-594.

Lezanski, T. The Fredholm theory of linear equations in Banach spaces. Studia Math., 13 (1953): 244-276.

Lidskii, V. B. Nonself-adjoint operators with trace. Dokl. Akad. Nauk SSSR, 125 (1959): 485-487; AMS Trans., 47 (1961): 43-46.

Mercer, T. Functions of positive and negative type and their connection with the theory of integral equations. Trans. London Phil. Soc. (A), 209 (1909): 415-446.

Pietsch, A. Eigenvalues and s-Numbers. Cambridge Studies in Advanced Math., 13. Cambridge University Press, Cambridge, 1987.

Retherford, J. R. Compact Operators and the Trace Theorem. London Math. Soc. Student Text, 27. Cambridge University Press, Cambridge, 1993.

Selberg, A. Harmonic analysis and discontinuous groups in weakly symmetric Riemannian spaces, with applications to Dirichlet series. J. Indian Math. Soc., 20 (1956): 121-129.

Sikorski, R. The determinant theory in Banach spaces. Colloq. Math., 8 (1961): 141-198.

Weyl, H. Inequalities between the two kinds of eigenvalues of a linear transformation. Proc. Nat. Acad. Sc., 35 (1949): 408-411.


31


SPECTRAL THEORY OF SYMMETRIC, NORMAL, AND UNITARY OPERATORS


In this chapter we study operators $M$ that map a complex Hilbert space $H$ into itself, that are bounded, and that are symmetric in the sense that $M^* = M$. According to the definition of adjoint this means that for all $x$ and $y$ in $H$

$$(Mx, y) = (x, My). \qquad (1)$$

Exercise 1. Show that

(a) The inverse of an invertible symmetric operator is symmetric.
(b) The product of commuting symmetric operators is symmetric.
(c) The set of symmetric operators is closed in the weak topology for operators.

In chapter 28 we saw that every compact symmetric operator has a complete set of orthonormal eigenvectors. In this chapter we generalize this result to include symmetric operators that are bounded but not compact. To show how to do this, we reformulate the spectral resolution of compact operators.

Denote by $\{e_n\}$ the eigenvectors of the compact operator $A$. Every vector $x$ in the Hilbert space $H$ can be expanded in a Fourier series, and so can $Ax$:

$$x = \sum a_n e_n, \qquad Ax = \sum \lambda_n a_n e_n. \qquad (2)$$

Denote by $E_n$ projection onto the eigenspace with eigenvalue $\lambda_n$. Then (2) can be rewritten as

$$x = \sum E_n x, \qquad Ax = \sum \lambda_n E_n x. \qquad (2')$$

We rewrite the sums in (2') as integrals by introducing the projection-valued measure $E(S)$ as follows: for any Borel set $S$ of $\mathbb{R}$,

$$E(S) = \sum_{\lambda_n \in S} E_n.$$

The support of the measure $E$ is the spectrum of $A$. Using the measure $E$ defined above, we can rewrite (2') in the form

$$x = \int dE(\lambda)\, x, \qquad Ax = \int \lambda\, dE(\lambda)\, x. \qquad (3)$$

Our objective in this chapter is to obtain a spectral resolution of form (3) for arbitrary bounded symmetric operators $M$. The projection-valued measure $E$ that enters the resolution is, of course, no longer a pure point measure.

The following result is as basic as it is simple:

Theorem 1. For $B$ bounded symmetric, $(Bx, y)$ is a bounded, skew symmetric form, linear in $x$, skew linear in $y$.
Conversely, let $b(x, y)$ be a skew symmetric form

$$b(y, x) = \overline{b(x, y)}, \qquad (4)$$

linear in $x$, and bounded:

$$|b(x, y)| \le c\, \|x\|\, \|y\|. \qquad (5)$$

Then $b$ can be expressed as

$$b(x, y) = (x, By), \qquad (6)$$

where $B$ is a bounded symmetric operator, and

$$\|B\| \le c. \qquad (7)$$

Proof. The direct part is a consequence of the symmetry of $B$, the Schwarz inequality, and the boundedness of $B$. To deduce the converse, we fix $y$ and regard $b(x, y)$ as a bounded linear functional of $x$, with bound $c\|y\|$. According to the Riesz-Frechet representation theorem, we can write this functional as a scalar product with some $w$:

$$b(x, y) = (x, w), \qquad (8)$$

where $w$ is uniquely determined by $y$; setting $x = w$ in (5) shows that $\|w\| \le c\|y\|$. Since the left and right sides of (8) depend skew linearly on $y$ and $w$, respectively, it follows that $w$ is a linear function of $y$:

$$w = By.$$

This proves (6) and (7). The symmetry of $B$ is a consequence of skew symmetry of $b$:

$$(x, By) = b(x, y) = \overline{b(y, x)} = \overline{(y, Bx)} = (Bx, y).$$

Note that $(x, Bx) = (Bx, x)$ is real for all $x$. □


31.1 THE SPECTRUM OF SYMMETRIC OPERATORS 


Theorem 2. The spectrum of a bounded, symmetric operator M on a Hilberi space 
is real. oo 


Proof. We have to show that every nonreal $\lambda = \alpha + i\beta$, $\beta \ne 0$, lies in the
resolvent set. Define the function B as follows:

$$B(x, y) = (x, (M - \lambda)y).$$

This function has all three properties listed as the hypotheses of theorem 6 of
chapter 6:

(i) B is linear in x, skew linear in y.
(ii) B is bounded, for by the Schwarz inequality

$$|B(x, y)| \le \|x\|\,\|(M - \lambda)y\| \le \|x\|\,\|y\|\,(\|M\| + |\lambda|).$$

(iii) B(y, y) is bounded from below:

$$B(y, y) = (y, (M - \lambda)y) = (y, My) - \alpha(y, y) + i\beta(y, y).$$

The first two terms on the right are real, the third imaginary; so

$$|B(y, y)| \ge |\operatorname{Im} B(y, y)| = |\beta|\,\|y\|^2.$$

We appeal now to the Lax-Milgram lemma, theorem 6 in chapter 6, which asserts
that every bounded linear functional $\ell(x)$ can be represented as B(x, y) for some y,
uniquely determined by $\ell$. Take $\ell(x) = (x, z)$; there is a uniquely determined y such
that B(x, y) = (x, z) for all x. Using the definition of B as $(x, (M - \lambda)y)$, we conclude
that $(M - \lambda)y = z$. Since z is arbitrary, this shows that $M - \lambda$ is invertible. Therefore
$\lambda$ belongs to the resolvent set of M. □


NOTE. Theorem 15 of chapter 19, stated and proved in the context of Gelfand's
theory of commutative B* algebras, implies theorem 2.


Theorem 3. The spectral radius of a bounded, symmetric operator M is equal to its
norm:

$$|\sigma(M)| = \|M\|. \tag{9}$$




Proof. Using the symmetry of M, the Schwarz inequality, and the definition of the
norm of $M^2$, we derive for any x,

$$\|Mx\|^2 = (Mx, Mx) = (x, M^2 x) \le \|x\|\,\|M^2 x\| \le \|M^2\|\,\|x\|^2.$$

It follows that $\|M\|^2 \le \|M^2\|$. Repeating this argument k times, we deduce that

$$\|M\|^n \le \|M^n\|, \qquad n = 2^k.$$

Since the norm is submultiplicative, the opposite inequality holds also. Therefore
$\|M\|^n = \|M^n\|$. Taking the nth root, and using formula (12') in chapter 17 for
the spectral radius, gives

$$|\sigma(M)| = \lim_{n \to \infty} \|M^n\|^{1/n} = \|M\|.$$ □
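
Theorem 3 can be checked numerically in the simplest finite-dimensional setting. The sketch below is only an illustration, not part of the text: it assumes real 2x2 matrices, and the helper names `sym_eigs`, `op_norm`, and `spectral_radius` are hypothetical. It compares the spectral radius with the operator norm for a symmetric matrix, and for a nilpotent (nonsymmetric) matrix where the two differ.

```python
import cmath
import math

def sym_eigs(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]], smallest first."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius

def op_norm(m):
    """Operator norm of a real 2x2 matrix: sqrt of the top eigenvalue of M^T M."""
    (p, q), (r, s) = m
    # M^T M is symmetric with entries [[p^2 + r^2, pq + rs], [pq + rs, q^2 + s^2]].
    _, top = sym_eigs(p * p + r * r, p * q + r * s, q * q + s * s)
    return math.sqrt(top)

def spectral_radius(m):
    """Largest modulus among the roots of the characteristic polynomial."""
    (p, q), (r, s) = m
    trace, det = p + s, p * s - q * r
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return max(abs((trace + disc) / 2.0), abs((trace - disc) / 2.0))

symmetric = [[2.0, 1.0], [1.0, -3.0]]
nilpotent = [[0.0, 1.0], [0.0, 0.0]]   # not symmetric: theorem 3 does not apply

print(spectral_radius(symmetric), op_norm(symmetric))   # the two values agree
print(spectral_radius(nilpotent), op_norm(nilpotent))   # the two values differ
```

The nilpotent example shows that symmetry cannot simply be dropped from the hypothesis: its spectrum is {0} while its norm is 1.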


Theorem 4. The spectrum of a bounded symmetric operator M lies in the closed
interval [a, b] on the real axis, where

$$a = \inf_{\|x\| = 1} (x, Mx), \qquad b = \sup_{\|x\| = 1} (x, Mx). \tag{10}$$

The endpoints of this interval belong to the spectrum of M.


Proof. Let $\lambda$ be real and less than a. From the definition (10) of a it follows that
for every x in H,

$$(x, (M - \lambda)x) = (x, Mx) - \lambda(x, x) \ge (a - \lambda)\|x\|^2.$$

It follows that $(x, (M - \lambda)x)^{1/2}$ is a norm equivalent to $\|x\|$. Therefore every linear
functional $\ell(x) = (x, z)$ can be represented uniquely as the associated scalar product
$(x, (M - \lambda)y)$. Since this holds for all x, $(M - \lambda)y = z$. Since z is an arbitrary
element of H, this proves that $M - \lambda$ is invertible, and so $\lambda$ does not belong to the
spectrum of M. We can deal similarly with $\lambda > b$.

To show that a and b belong to the spectrum, we observe that for $\|x\| = 1$,
$|(x, Mx)| \le \|x\|\,\|Mx\| \le \|M\|$. Therefore by definition (10) of a and b,

$$|a| \le \|M\|, \qquad |b| \le \|M\|. \tag{11}$$

On the other hand, since the spectrum lies in the interval [a, b],

$$|\sigma(M)| \le \max\{|a|, |b|\}. \tag{11'}$$

According to (9), $|\sigma(M)| = \|M\|$; comparing (11) and (11'), we see that this can be
only if $\max\{|a|, |b|\} = |\sigma(M)|$. In particular, if $b > |a|$, then b lies in the spectrum of
M, and if $|a| > b$, then a lies in the spectrum of M. Replacing M by M + cI, with
c any constant, we add c to the spectrum of M, as well as to a and b. Applying the
results above to the operator M + cI, and choosing c judiciously, we conclude that
both a and b belong to the spectrum of M. □




Theorem 5. Let M and N denote a pair of symmetric operators. Then

$$\operatorname{dist}(\sigma(M), \sigma(N)) \le \|M - N\|, \tag{12}$$

where the distance of the two closed point-sets $\sigma(M)$ and $\sigma(N)$ is defined as the
larger of the two quantities

$$\max_{\nu \in \sigma(N)} \min_{\mu \in \sigma(M)} |\nu - \mu|, \qquad \max_{\mu \in \sigma(M)} \min_{\nu \in \sigma(N)} |\nu - \mu|. \tag{13}$$


Proof. Denote $\|M - N\|$ by d. Suppose that one of the quantities in (13), say the
first, is > d. Then for some $\nu$ in $\sigma(N)$,

$$\min_{\mu \in \sigma(M)} |\mu - \nu| > d. \tag{14}$$

Such a $\nu$ belongs to the resolvent set of M, and so $M - \nu I$ is invertible. According to
the spectral mapping theorem

$$\sigma((M - \nu I)^{-1}) = (\sigma(M) - \nu)^{-1}.$$

It follows from this and (14) that

$$|\sigma((M - \nu I)^{-1})| < d^{-1}. \tag{15}$$

Since $(M - \nu I)^{-1}$ is a symmetric operator, its spectral radius equals, according to
theorem 3, its norm. So it follows from (15) that $\|(M - \nu I)^{-1}\| < d^{-1}$.
Next we decompose

$$N - \nu I = M - \nu I + N - M = (M - \nu I)\bigl(I + (M - \nu I)^{-1}(N - M)\bigr).$$

The second factor on the right is of the form I + K, $K = (M - \nu I)^{-1}(N - M)$. Using
the estimate above for $(M - \nu I)^{-1}$ and that $\|N - M\| = d$, we conclude that

$$\|K\| \le \|(M - \nu I)^{-1}\|\,\|N - M\| < d^{-1} d = 1.$$

It follows that the second factor I + K is invertible by the geometric series. The first
factor $M - \nu I$, too, is invertible; so is their product $N - \nu I$. But this contradicts $\nu$
being in the spectrum of N. □
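
Inequality (12) is easy to test on small matrices. The following sketch is an illustration under stated assumptions, not part of the text: it treats real symmetric 2x2 matrices stored as triples (a, b, d) standing for [[a, b], [b, d]], and the helpers `sym_eigs` and `spectral_distance` are hypothetical names. It computes the distance (13) between the two spectra and compares it with the norm of M - N, which by theorem 3 is the largest absolute eigenvalue of the difference.

```python
import math

def sym_eigs(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]], smallest first."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius

def spectral_distance(s, t):
    """The larger of the two max-min quantities in (13) for finite sets s, t."""
    first = max(min(abs(v - u) for u in s) for v in t)
    second = max(min(abs(v - u) for v in t) for u in s)
    return max(first, second)

M = (1.0, 2.0, -1.0)   # entries (a, b, d) of [[a, b], [b, d]]
N = (1.5, 1.0, 0.0)

difference = tuple(m - n for m, n in zip(M, N))
# Theorem 3: the norm of the symmetric matrix M - N equals its spectral radius.
norm_of_difference = max(abs(e) for e in sym_eigs(*difference))
distance = spectral_distance(sym_eigs(*M), sym_eigs(*N))

print(distance, norm_of_difference)   # the first number does not exceed the second
```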


31.2 FUNCTIONAL CALCULUS FOR SYMMETRIC OPERATORS 


In chapter 28 on compact symmetric operators we first constructed a spectral
resolution, that is, a complete orthonormal set of eigenvectors, and then used these
eigenvectors to define f(M) for any bounded function f. For general symmetric
operators we proceed in the reverse order: first we build a functional calculus f(M) for
all real-valued functions f continuous on the spectrum of M, and then we use this to
construct a spectral resolution.




Let q be a polynomial with real coefficients:

$$q(\lambda) = a_n \lambda^n + \cdots + a_0.$$

Then, if M is symmetric, so is

$$q(M) = a_n M^n + \cdots + a_0 I.$$

According to the spectral mapping theorem, theorem 5 of chapter 17,

$$\sigma(q(M)) = q(\sigma(M)). \tag{16}$$

Combining this with formula (9) of theorem 3, we deduce that

$$\|q(M)\| = \max_{\lambda \in \sigma(M)} |q(\lambda)|. \tag{16'}$$


Let $f(\lambda)$ be any continuous real-valued function on $\sigma(M)$, the spectrum of M. We
can approximate $f(\lambda)$ uniformly by polynomials on $\sigma(M)$, for we can extend f
continuously to an interval containing $\sigma(M)$. Now according to the Weierstrass
approximation theorem, f can be approximated uniformly on this interval by polynomials.
So there is a sequence $\{q_n\}$ such that

$$\lim_{n \to \infty} \max_{\lambda \in \sigma(M)} |f(\lambda) - q_n(\lambda)| = 0.$$

It follows that $\{q_n\}$ is a Cauchy sequence:

$$\lim_{m,n \to \infty} \max_{\lambda \in \sigma(M)} |q_n(\lambda) - q_m(\lambda)| = 0.$$

It follows then from (16') that

$$\lim_{m,n \to \infty} \|q_n(M) - q_m(M)\| = 0.$$

Since the bounded operators are complete, $\lim_{n \to \infty} q_n(M)$ exists. We denote this
limit by f(M). Its properties are summarized in


Theorem 6. $f \to f(M)$ is an isometric isomorphism:

(i) $(f + g)(M) = f(M) + g(M)$, $(fg)(M) = f(M)g(M)$.
(ii) $\|f(M)\| = \max_{\lambda \in \sigma(M)} |f(\lambda)|$.
(iii) f(M) is symmetric and $\sigma(f(M)) = f(\sigma(M))$.


Proof. Property (i) holds when f and g are polynomials; therefore it holds for
uniform limits of polynomials.

(ii) Since f(M) is the uniform limit of $q_n(M)$, $\|f(M)\| = \lim_{n \to \infty} \|q_n(M)\|$.
Since $f(\lambda)$ is the uniform limit of $q_n(\lambda)$,

$$\max_{\sigma(M)} |f(\lambda)| = \lim_{n \to \infty} \max_{\sigma(M)} |q_n(\lambda)|.$$

These two together, combined with (16'), give (ii).

(iii) The adjoint of a limit of maps is the limit of their adjoints. Since each $q_n(M)$
is symmetric, it follows that so is f(M). Since f(M) is the uniform limit of $q_n(M)$,
it follows from (12) of theorem 5 that $\sigma(f(M))$ is the limit of $\sigma(q_n(M))$. Thus (iii)
follows from (16). This completes the proof of theorem 6. □
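
In finite dimensions the functional calculus reduces to polynomial interpolation on the spectrum, which gives a quick way to test theorem 6. The sketch below is illustrative only and rests on assumptions not in the text: it handles a real symmetric 2x2 matrix with two distinct eigenvalues, and `apply_function` is a hypothetical helper that evaluates the degree-one polynomial agreeing with f on the spectrum.

```python
import math

def sym_eigs(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]], smallest first."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius

def mat_mul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply_function(f, m):
    """f(M) for a symmetric 2x2 matrix M with two distinct eigenvalues l1, l2.

    f(M) = f(l1) P1 + f(l2) P2, where P1 = (M - l2 I)/(l1 - l2) and
    P2 = (M - l1 I)/(l2 - l1); this is q(M) for the linear polynomial q
    that agrees with f on the spectrum.
    """
    l1, l2 = sym_eigs(m[0][0], m[0][1], m[1][1])
    out = [[0.0, 0.0], [0.0, 0.0]]
    for lam, other in ((l1, l2), (l2, l1)):
        for i in range(2):
            for j in range(2):
                proj = (m[i][j] - (other if i == j else 0.0)) / (lam - other)
                out[i][j] += f(lam) * proj
    return out

M = [[2.0, 1.0], [1.0, 2.0]]                # spectrum {1, 3}
f = lambda lam: lam * lam - 3.0 * lam       # f(1) = -2, f(3) = 0

fM = apply_function(f, M)
norm_fM = max(abs(e) for e in sym_eigs(fM[0][0], fM[0][1], fM[1][1]))
max_f = max(abs(f(lam)) for lam in sym_eigs(2.0, 1.0, 2.0))
print(norm_fM, max_f)                       # property (ii): the two agree

root = apply_function(math.sqrt, M)         # the positive square root of M
print(mat_mul(root, root))                  # reproduces M up to rounding
```

The last two lines anticipate the use of f(λ) = √λ made below: since the spectrum of this M is positive, the calculus produces a symmetric square root.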


Recall from chapter 18 that a symmetric operator M is called positive if

$$(Mx, x) \ge 0 \quad \text{for all } x \text{ in } H.$$


Theorem 7. A bounded symmetric operator is positive if and only if its spectrum
contains only nonnegative numbers:

$$\sigma(M) \ge 0.$$


Proof. (i) Suppose that $\sigma(M) \ge 0$. The function $f(\lambda) = \sqrt{\lambda}$ is continuous for
$\lambda \ge 0$, which contains $\sigma(M)$. The functional calculus defines $\sqrt{M} = N$ as a
symmetric operator, which satisfies $N^2 = M$. Therefore

$$(Mx, x) = (N^2 x, x) = (Nx, Nx) \ge 0,$$

which shows that M is positive.
(ii) Conversely, suppose that M is positive. According to theorem 4, the infimum
of the spectrum of M is $a = \inf_{\|x\| = 1} (x, Mx)$. Since M is positive, $a \ge 0$, which
shows that $\sigma(M) \ge 0$. □


Corollary. Every positive symmetric operator has a positive symmetric square root.


Exercise 2. Show that a positive symmetric operator has only one positive square 
root. How many square roots does it have that are not positive? 


In theorem 1 of chapter 30 we showed that every compact operator has a polar 
decomposition: 


Every compact operator T can be factored as T = UA, where A is a positive
symmetric operator, and U is an isometry on the range of A, and zero on the
orthogonal complement of the range of A.


The only place in the construction of the polar decomposition where we used
compactness was in taking the square root of T*T. Now that we know how to take
the square root of any positive symmetric operator, we can remove compactness and
assert:


Every bounded operator in Hilbert space has a polar decomposition. 


31.3 SPECTRAL RESOLUTION OF SYMMETRIC OPERATORS 


According to the Riesz representation theorem, every bounded linear functional $\ell$
on the space of continuous functions f defined on the compact space $\sigma(M)$ can be
described as the integral of f with respect to a uniquely determined measure m(S)
defined on the Borel subsets S of $\sigma(M)$, whose total variation is finite. We use now
the functional calculus described in theorem 6 to construct the functionals

$$\ell_{x,y}(f) = (f(M)x, y), \tag{17}$$

defined for every pair of points x, y in H. By the Riesz representation theorem there
is a complex measure m, uniquely determined, such that

$$(f(M)x, y) = \int f(\lambda)\,dm_{x,y}. \tag{18}$$

Since $\ell_{x,y}$ depends on x and y, so does the measure m; the manner of the dependence
of m on x and y reflects the manner of dependence of $\ell$ on x, y. We list them in


Theorem 8. Let $m_{x,y}$ be the measure on $\sigma(M)$ defined by (18).

(i) $m_{x,y}$ depends sesquilinearly on x and y, that is, linearly on x, skew linearly
on y.
(ii) $m_{x,y}$ is skew symmetric in x, y: $m_{y,x} = \overline{m_{x,y}}$.
(iii) Total var $m_{x,y} \le \|x\|\,\|y\|$.
(iv) The measures $m_{x,x}$ are real and nonnegative.


Proof. Clearly, by (17), $\ell_{x,y}$ is linear in x, skew linear in y. Since m is uniquely
determined by $\ell$, the measure m representing $\ell$ depends on x and y in the same way:

$$m_{x+z,y} = m_{x,y} + m_{z,y}, \tag{19}$$

since both measures represent $\ell_{x+z,y}$. This proves (i).

(ii) Since f(M) is symmetric, the left side of (18) is skew symmetric in x, y;
using uniqueness of the representing measure, we conclude that m depends skew
symmetrically on x, y.

(iii) According to the Riesz theorem, the total variation of the representing
measure equals the norm of the functional $\ell$. Using the Schwarz inequality and (16'), we
conclude that

$$|\ell_{x,y}(f)| \le \|f(M)x\|\,\|y\| \le \|f(M)\|\,\|x\|\,\|y\| = |f|_{\max}\,\|x\|\,\|y\|.$$

This shows that $|\ell_{x,y}| \le \|x\|\,\|y\|$, so (iii) follows.

(iv) According to part (iii) of theorem 6, for f real the spectrum of f(M) is
$f(\sigma(M))$. So for positive functions f, f(M) is a symmetric operator whose
spectrum is positive. Therefore according to theorem 7, f(M) is a positive operator. This
shows that the linear functional $\ell_{x,x}(f) = (f(M)x, x)$ is positive; but then so is the
measure $m_{x,x}$ representing it. □


According to theorem 8, $m_{x,y}(S)$, S a Borel subset of $\sigma(M)$, is a bounded, skew
symmetric sesquilinear functional of x and y. We conclude from theorem 1 that for
each S there is a bounded symmetric operator E(S) such that

$$m_{x,y}(S) = (E(S)x, y). \tag{20}$$

This family of operators has the following properties:
This family of operators has the following properties: 


Theorem 9. Let E(S) be the family of maps defined by (20), where $m_{x,y}(S)$ is
defined by (18).

(i) E*(S) = E(S).
(ii) $\|E(S)\| \le 1$.
(iii) $E(\emptyset) = 0$, $E(\sigma(M)) = I$.
(iv) If $S \cap T = \emptyset$, $E(S \cup T) = E(S) + E(T)$.
(v) Each E(S) commutes with M.
(vi) $E(S \cap T) = E(S)E(T)$.
(vii) Each E(S) is an orthogonal projection. If S and T are disjoint, the ranges of
E(S) and E(T) are orthogonal.
(viii) All orthogonal projections E(S), E(T) commute.


Proof. (i) is part of theorem 1. Part (ii) follows from part (iii) of theorem 8.
(iii) Since $m_{x,y}(\emptyset) = 0$, it follows from (20) that $E(\emptyset) = 0$. On the other hand,
setting $f(\lambda) \equiv 1$, f(M) = I in (18) gives for all x and y,

$$(x, y) = \int_{\sigma(M)} dm_{x,y} = (E(\sigma(M))x, y),$$

which means that $E(\sigma(M)) = I$.


Part (iv) follows from the additivity of the measure $m_{x,y}$. To show (v), we note
that since M commutes with f(M), and is symmetric,

$$(f(M)Mx, y) = (Mf(M)x, y) = (f(M)x, My).$$

The functional of f on the left is represented by the measure $m_{Mx,y}$; the functional
on the right by the measure $m_{x,My}$. Since the functionals are the same, so are the
measures:

$$m_{Mx,y} = m_{x,My}.$$

Setting this into (20), and using once more the symmetry of M, gives

$$(E(S)Mx, y) = (E(S)x, My) = (ME(S)x, y).$$

Since this holds for all x and y, E(S)M = ME(S), as claimed in (v).

We postpone the proof of (vi) to section 31.5.

(vii) Setting S = T in (vi) shows that $E(S) = E^2(S)$, namely that E(S) is an
idempotent. The geometric expression of this algebraic fact is that E(S) is a projection.
Since, by part (i), E is symmetric, E(S) is an orthogonal projection. It follows from
(iii) and (vi) that if S and T are disjoint, the ranges of E(S) and E(T) are orthogonal.

Part (viii) results from interchanging S and T in (vi). □

The family of operators E(S) is an orthogonal projection-valued measure.

Exercise 3. (a) Show that E(S) is countably additive in the strong topology. (Hint:
Use the orthogonality of the ranges of E(S) and E(T) when S and T are disjoint.)

(b) Show that E(S) is not countably additive in the norm topology.

We summarize:

Theorem 9'. Let H be a Hilbert space, $M : H \to H$ a bounded, linear symmetric
operator. Then there is a uniquely determined orthogonal projection-valued measure E
on the spectrum of M such that $E(S \cap T) = E(S)E(T)$ and

$$f(M) = \int_{\sigma(M)} f(\lambda)\,dE \tag{21}$$

for all continuous functions f on $\sigma(M)$. The integral exists in the norm topology.

Proof. The meaning of (18) and (20) is that (21) holds in the weak topology. To
show that it holds in the norm topology, it suffices to show that the Riemann-Stieltjes
integral on the right converges in that topology. This can be done in the standard
fashion, combined with the estimate

$$\Bigl\| \sum_j a_j E(I_j) \Bigr\| \le \max_j |a_j|,$$

where $\bigcup I_j = \sigma(M)$ is a decomposition of $\sigma(M)$ into a finite number of disjoint
pieces $I_j$. The estimate follows from the orthogonality of the ranges of $E(I_j)$. The
uniqueness of E(S) follows from the uniqueness of the scalar measures (18). □




Take $f \equiv 1$, and $f(\lambda) = \lambda$:

$$I = \int_{\sigma(M)} dE, \qquad M = \int_{\sigma(M)} \lambda\,dE. \tag{22}$$

(22) is called the spectral resolution of M.
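
For a symmetric matrix the projection-valued measure is a pure point measure, and the properties in theorem 9 together with (22) can be verified directly. The sketch below is an illustration under assumptions not in the text: a real symmetric 2x2 matrix with distinct eigenvalues, Borel sets modelled as Python predicates on the real line, and hypothetical helpers `spectral_projections` and `E`.

```python
import math

def sym_eigs(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]], smallest first."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius

def mat_mul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def spectral_projections(m):
    """Pairs (eigenvalue, orthogonal projection) for a symmetric 2x2 matrix
    with two distinct eigenvalues."""
    l1, l2 = sym_eigs(m[0][0], m[0][1], m[1][1])
    pairs = []
    for lam, other in ((l1, l2), (l2, l1)):
        proj = [[(m[i][j] - (other if i == j else 0.0)) / (lam - other)
                 for j in range(2)] for i in range(2)]
        pairs.append((lam, proj))
    return pairs

def E(m, contains):
    """E(S): sum of the projections whose eigenvalue lies in S (contains(lam) is True)."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for lam, proj in spectral_projections(m):
        if contains(lam):
            total = [[total[i][j] + proj[i][j] for j in range(2)] for i in range(2)]
    return total

def close(x, y, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))

M = [[2.0, 1.0], [1.0, 2.0]]               # eigenvalues 1 and 3
S = lambda lam: 0.0 <= lam <= 2.0          # contains the eigenvalue 1 only
T = lambda lam: 0.5 <= lam <= 4.0          # contains both eigenvalues
S_and_T = lambda lam: S(lam) and T(lam)

identity = [[1.0, 0.0], [0.0, 1.0]]
# The integral of lambda dE is a finite sum over the two eigenvalues.
integral = [[sum(lam * p[i][j] for lam, p in spectral_projections(M))
             for j in range(2)] for i in range(2)]

print(close(E(M, T), identity))                               # E(sigma(M)) = I
print(close(E(M, S_and_T), mat_mul(E(M, S), E(M, T))))        # property (vi)
print(close(mat_mul(E(M, S), E(M, S)), E(M, S)))              # E(S) is idempotent
print(close(integral, M))                                     # formula (22)
```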


31.4 ABSOLUTELY CONTINUOUS, SINGULAR, AND POINT SPECTRA


We give now an important further refinement of the spectral resolution. According
to the Lebesgue decomposition theorem for measures, any measure on $\mathbb{R}$ can be
decomposed as the sum of a point measure, supported on a denumerable set, a singular
measure, supported on a set of Lebesgue measure zero, and an absolutely continuous
measure with respect to Lebesgue measure.

We apply this to the measures $m_{x,y} = (Ex, y)$:

$$m_{x,y} = m^{(p)}_{x,y} + m^{(s)}_{x,y} + m^{(c)}_{x,y}. \tag{23}$$

From the uniqueness of the Lebesgue decomposition we conclude that all three
measures on the right in (23) depend linearly on x, skew linearly on y. For any S these
sesquilinear functionals can be represented as

$$m^{(p)}_{x,y}(S) = (E^{(p)}(S)x, y), \quad m^{(s)}_{x,y}(S) = (E^{(s)}(S)x, y), \quad \text{and} \quad m^{(c)}_{x,y}(S) = (E^{(c)}(S)x, y).$$

These bounded, symmetric operators $E^{(p)}$, $E^{(s)}$, $E^{(c)}$ have all properties listed in
theorem 9; that is, each family is a projection-valued measure, each orthogonal to
the others.

Denote the ranges of $E^{(p)}(\sigma(M))$, $E^{(s)}(\sigma(M))$, and $E^{(c)}(\sigma(M))$ as $H^{(p)}$, $H^{(s)}$,
and $H^{(c)}$; they are called the point, singular, and absolutely continuous subspaces of
H with respect to the operator M. Clearly,

$$H^{(p)} \oplus H^{(s)} \oplus H^{(c)} = H.$$

31.5 THE SPECTRAL REPRESENTATION OF SYMMETRIC OPERATORS


Spectral representation is an infinite dimensional analogue of the diagonal form of 
symmetric matrices. 


Theorem 10. For any vector x in the Hilbert space H, and any continuous
function f,

$$\|f(M)x\|^2 = \int |f(\lambda)|^2\,dm_{x,x}, \tag{24}$$

where the measure $m_{x,x}$ is the representing measure appearing in formula (18):

$$(f(M)x, y) = \int f(\lambda)\,dm_{x,y}. \tag{18}$$

Proof. For real-valued f we use the symmetry of f(M) to get

$$\|f(M)x\|^2 = (f(M)x, f(M)x) = (f^2(M)x, x) = \int f^2(\lambda)\,dm_{x,x},$$

where in the last step we used (18) with $f^2$ in place of f.

For complex-valued functions f = g + ih, similar manipulations lead to (24). □


For any given x denote by $J_x$ the set of elements z of form z = f(M)x, f any
complex-valued continuous function on $\sigma(M)$. Clearly, $J_x$ is an invariant subspace of
H, that is, $J_x$ is mapped into itself by M. We say that the vector z in $J_x$ is represented
by the function f. Relation (24) shows that this representation is an isometry when
the functions f are normed by the $L^2(m_{x,x})$-norm.

If z = f(M)x is represented by $f(\lambda)$, Mz is represented by $\lambda f(\lambda)$.

Denote by $K_x$ the closure of $J_x$ in the Hilbert space H. It follows from
isometry that every element z in $K_x$ can be represented isometrically by a function h
in $L^2(m_{x,x})$, and that Mz belongs to $K_x$, represented by $\lambda h(\lambda)$. Conversely, every
function h in $L^2(m_{x,x})$ represents some element of $K_x$. This is called a spectral
representation of M acting on $K_x$.

We recall from chapter 7, section 7.1, that a measure n is called absolutely
continuous with respect to another measure m if every set that has m-measure zero has
n-measure zero. Two measures are called equivalent if each is absolutely continuous
with respect to the other. According to the Radon-Nikodym theorem, every measure
n absolutely continuous with respect to m can be represented as dn = g dm, where
g is a nonzero positive function integrable with respect to m.

It follows that if we have a spectral representation of a symmetric operator M
as $L^2(m)$, then we can get a spectral representation of M as $L^2(n)$, n any measure
equivalent with m. If z is represented by h in $L^2(m)$, it is represented by $h g^{-1/2}$ in
$L^2(n)$.

The space $K_x$ on which we have constructed a spectral representation is, in
general, not all of H but only a closed invariant subspace. We appeal now to a result
contained in

Exercise 4. Let M be a symmetric operator acting on a Hilbert space H, K an
invariant subspace of M. Show that the orthogonal complement of K also is invariant
under M.

Suppose now that $K_x$ above is a proper subspace of H. Choose any vector y
orthogonal to $K_x$; according to exercise 4, so is My. It follows that for any polynomial
$q(\lambda)$, q(M)y is orthogonal to $K_x$. Denote by $K_y$ the closure of the set of vectors
q(M)y. Clearly, $K_y$ is a closed invariant subspace of H, represented spectrally by
$L^2(m_{y,y})$. Using Zorn's lemma and the above construction, we deduce




Theorem 11. There exists a family $\{K_j\}$ of closed subspaces of H such that

(i) The $K_j$ are pairwise orthogonal, and they span H:

$$H = K_1 \oplus K_2 \oplus \cdots. \tag{25}$$

(ii) Each $K_j$ is invariant under M and is spectrally represented as $L^2(m_j)$.

Such a collection $\{K_j\}$ is called a spectral representation of M on H.
It is easy to derive from a spectral representation of M its spectral resolution.


For any measurable set S, define on each $K_j$ the operator E(S) as follows: in the
representation of $K_j$ as $L^2(m_j)$, E(S) is multiplication by $c_S(\lambda)$, the characteristic
function of S, defined as

$$c_S(\lambda) = \begin{cases} 1 & \text{if } \lambda \text{ in } S, \\ 0 & \text{otherwise.} \end{cases}$$

Exercise 5. Verify that E(S) as defined above is a spectral resolution of M, namely
that the family {E(S)} is a projection-valued measure that has all the properties listed
in theorems 9 and 9'.


Although a given operator has many spectral representations, the spectral
resolution derived from each is the same, as explained in theorem 9'.

We turn now to the last topic of this section: spectral multiplicity. To simplify
the discussion, we take the case that the underlying Hilbert space H is separable. It
follows that in any spectral representation (25) the family $\{K_j\}$ is denumerable.

For further simplification we assume that the spectrum of M is absolutely
continuous with respect to Lebesgue measure in the sense explained in section 31.4. By
the Radon-Nikodym theorem, such a measure is equivalent to Lebesgue measure on a
subset S of $\mathbb{R}$. Under this assumption the measures $m_j$ entering the spectral
representation of theorem 11 can be taken to be Lebesgue measure over sets $S_j$ that support
the measure, where the support of m is the set of points $\lambda$ that have the property
that the restriction of m to any open interval containing $\lambda$ has positive measure.

The following are easily verified: if $\{m_j\}$ is a denumerable collection of measures,
the support of $\sum m_j$ contains the union of the supports of $m_j$.

Two measures are singular with respect to each other if the intersection of their
supports has Lebesgue measure zero.


Lemma 12. Take any two spectral representations (25) and (25') of a bounded,
symmetric operator M, with absolutely continuous spectrum, acting on a separable
Hilbert space:

$$H = K'_1 \oplus K'_2 \oplus \cdots, \qquad K'_j \leftrightarrow L^2(S'_j). \tag{25'}$$

We claim that up to sets of measure zero

$$\bigcup S_j = \bigcup S'_j.$$

Proof. Take any index k; choose x to be that vector in $K_k$ that is represented by
the function $\equiv 1$ in $S_k$. For any continuous function f,

$$(f(M)x, x) = \int_{S_k} f(\lambda)\,d\lambda. \tag{26}$$

Denote by $x'_j$ the projection of x onto $K'_j$, and denote by $g_j(\lambda)$ the function
representing $x'_j$ in $L^2(S'_j)$. Then $x = \sum x'_j$, and

$$(f(M)x, x) = \sum_j (f(M)x'_j, x'_j) = \sum_j \int f(\lambda)|g_j|^2\,d\lambda = \int f(\lambda)|g|^2\,d\lambda, \tag{26'}$$

where $|g|^2 = \sum |g_j|^2$.
Both (26) and (26') represent the same bounded linear functional. Therefore the
representing measures must be equal:

$$c_{S_k}\,d\lambda = |g|^2\,d\lambda.$$

The support of the measure on the left is $S_k$, and the support of the measure on the
right is contained in $\bigcup S'_j$. Therefore $S_k \subset \bigcup S'_j$. Reversing the roles of (25) and (25'),
we obtain lemma 12. □


Definition. The spectral multiplicity of a point $\lambda$ in a spectral representation (25),
$\sum_j c_{S_j}(\lambda)$, is the number of sets $S_j$ to which $\lambda$ belongs; $S_j$ is the support of the jth
measure $m_j$ in the representation (25). The spectral multiplicity of $\lambda$ can be zero,
any natural number, or $\infty$.


A spectral representation is far from unique, since its construction contains many
arbitrary choices. So it is far from clear that the spectral multiplicity function has
an invariant meaning, the same for all spectral representations of a given bounded
symmetric operator. According to a classical result of E. Hellinger:


Theorem 13. Let M be a bounded, symmetric operator in a separable Hilbert space
H, whose spectrum is absolutely continuous. Let $H = K_1 \oplus K_2 \oplus \cdots$ be any
spectral decomposition of H for M, such that $K_j \leftrightarrow L^2(S_j)$ and the action of M is
represented as multiplication by $\lambda$.

Spectral multiplicity, defined as $\sum_j c_{S_j}(\lambda)$, is the same for all spectral
representations of M.


Proof. We will use two simple operations for rearranging spectral
decompositions, splitting and combination.

Splitting. Suppose that a subspace K of H is represented spectrally by $L^2(m)$, m
some measure on $\mathbb{R}$. Split m as the sum $m = \sum m_j$ of measures that are pairwise
singular with respect to each other. Each $L^2(m_j)$ represents some closed subspace
$K_j$ of K; their direct sum $K_1 \oplus K_2 \oplus \cdots$ is a spectral decomposition of K.

Combination. This is the reversal of splitting. Suppose that $\{K_j\}$ is a collection
of pairwise orthogonal subspaces of H, each represented spectrally by $L^2(m_j)$.
Suppose that the $m_j$ are pairwise singular with respect to each other. Then
$K = K_1 \oplus K_2 \oplus \cdots$ is represented spectrally by $L^2(m)$, $m = \sum m_j$.

We will use splitting and combination to rearrange any spectral representation

$$H = K_1 \oplus K_2 \oplus \cdots, \qquad K_j \leftrightarrow L^2(S_j)$$

into a standard form, as follows: Decompose each set $S_j$ as

$$S_j = \bigcup_k S_j^k,$$

where $S_j^k$ is that subset of $S_j$ which belongs to exactly k of the sets $S_i$. Define the set
$M_1$ as

$$M_1 = \bigcup_j S_j^1. \tag{27}$$

Clearly, $M_1$ is the set of points that belong to exactly one set $S_j$, namely points of
spectral multiplicity one in the spectral representation under discussion. Similarly
we define $M_k$ as

$$M_k = \bigcup_j S_j^k. \tag{27'}$$

Clearly, $M_k$ is the set of points of multiplicity k. $M_\infty$ is the set of points of infinite
multiplicity. It follows from the construction that

$$\bigcup S_j = \bigcup M_k \cup M_\infty. \tag{28}$$

By splitting each $S_j$ and recombining them into sets $M_k$, as described above, we can
split and recombine the spectral representations $K_j \leftrightarrow L^2(S_j)$ into spectral
representations by $L^2(M_k)$, $k = 1, 2, \ldots, \infty$. There is exactly one subspace represented
by $L^2(M_1)$, call it $H_1$. There are two orthogonal subspaces represented by $L^2(M_2)$,
whose direct sum we call $H_2$, and so on. The subspaces $H_k$ are orthogonal to each
other; so we obtain the spectral decomposition

$$H = H_1 \oplus H_2 \oplus \cdots \oplus H_\infty, \tag{29}$$

where $H_k$ is represented spectrally by k copies of $L^2(M_k)$. We call (29) the standard
form of the spectral representation (25).

We claim that any other spectral representation (25') has the same standard form.
Here is the proof: Denote the standard spectral representation obtained from (25') as

$$H = H'_1 \oplus H'_2 \oplus \cdots \oplus H'_\infty, \tag{29'}$$

where $H'_k$ is represented by $L^2(M'_k) \oplus \cdots \oplus L^2(M'_k)$. We claim that $M'_k = M_k$ and
$H'_k = H_k$. To see why, we shall construct vectors in $H'_k$, k > 1, that are orthogonal to
$H_1$. We start with k = 2.

Let x be the vector in $H_1$ that is represented by the function $\equiv 1$ in $M_1$ in the
standard form (29). The vectors f(M)x span $H_1$. Denote by $x'_2$ the projection of x onto
$H'_2$ in the standard spectral representation (29'). Denote by $\{g_1, g_2\}$ the functions in
$M'_2$ representing $x'_2$.

Define the functions $h_1$ and $h_2$ in $M'_2$ as follows:

$$h_1 = \begin{cases} 1 & \text{where } g_1 = 0, \\ -\overline{g_2} & \text{elsewhere,} \end{cases} \qquad h_2 = \begin{cases} 1 & \text{where } g_2 = 0, \\ \overline{g_1} & \text{elsewhere.} \end{cases}$$

Let y be the vector in $H'_2$ represented by $\{h_1, h_2\}$. We claim that y is orthogonal to
all vectors of form f(M)x. Clearly, since y belongs to $H'_2$, it is orthogonal to all
components of f(M)x in the decomposition (29') but the second component. So,
using the definition of $h_1$ and $h_2$, we get

$$(y, f(M)x) = (y, f(M)x'_2) = (\{h_1, h_2\}, f(\lambda)\{g_1, g_2\}) = \int (h_1 \overline{g_1} + h_2 \overline{g_2})\,\overline{f(\lambda)}\,d\lambda = 0.$$

The vector f(M)y is represented by $\{f(\lambda)h_1, f(\lambda)h_2\}$. Therefore, using the
definition of $h_1$ and $h_2$, we get

$$\|f(M)y\|^2 = \int_{M'_2} |f(\lambda)|^2 \bigl(|h_1|^2 + |h_2|^2\bigr)\,d\lambda.$$

The formulas for $h_1$ and $h_2$ show that $|h_1|^2 + |h_2|^2$ is positive on $M'_2$, and belongs
to $L^1(M'_2)$. It follows that the closure of {f(M)y} has a spectral representation by
$L^2(M'_2)$. Since y is orthogonal to $H_1$, so are the vectors f(M)y, and so they all
belong to $H_1^\perp$, the orthogonal complement of $H_1$. The Hilbert space $H_1^\perp$ is invariant
under M. The subspaces $H_2, \ldots, H_\infty$ furnish a spectral representation of M on $H_1^\perp$
on the sets $M_2, \ldots, M_\infty$. We appeal now to lemma 12, according to which the point
set $M'_2$ belongs to the union of the sets $M_k$, k > 1, entering the spectral representation
(29) of M over the Hilbert space $H_1^\perp$. Therefore we conclude that $M'_2$ is contained
in $M_2 \cup M_3 \cup \cdots \cup M_\infty$.

Using the same argument, we conclude that $M'_3, \ldots, M'_\infty$ all are contained in
$M_2 \cup M_3 \cup \cdots \cup M_\infty$. Since the sets $M_k$ are pairwise disjoint, it follows that the sets
$M'_k$, k > 1, are disjoint from $M_1$. On the other hand, from lemma 12 applied to M
acting on the whole space H, we conclude that

$$\bigcup M_k = \bigcup M'_k.$$

Combining this with the disjointness of $M_1$ and $M'_k$, k > 1, we conclude that $M_1$ is
contained in $M'_1$. Reversing the role of the two spectral representations, we conclude
that $M'_1$ is contained in $M_1$; therefore $M_1 = M'_1$.

To show that $H_1 = H'_1$, we argue as in the proof of lemma 12. We leave it to the
diligent reader to show that for all k, $M_k = M'_k$ and $H_k = H'_k$. □


Exercise 6. Let (25) be a spectral representation for M, $K_j \leftrightarrow L^2(m_j)$, $S_j$ the
support of $m_j$. Show that the closure of the union of the $S_j$ is the spectrum of M.


Exercise 7. Give an example to show that the spectrum of M may contain a set of
positive Lebesgue measure whose multiplicity is zero.


Definition. Two bounded, symmetric operators M and N acting on a Hilbert space
H are called unitarily equivalent if there is a unitary map U, that is, a one-to-one
norm-preserving map of H onto H, that carries M into N:

$$N = UMU^{-1}.$$


Theorem 14. Two bounded, symmetric operators M and N whose spectrum is
absolutely continuous are unitarily equivalent if and only if they have the same spectral
multiplicities.


Proof. If M and N are unitarily equivalent, U carries the standard spectral
representation of M into a standard spectral representation of N. Conversely, if the
spectral multiplicity sets $M_k$ and $N_k$ for the two operators M and N are the same, then the
standard spectral representations for M and N furnish the unitary map U. □


31.6 SPECTRAL RESOLUTION OF NORMAL OPERATORS 


A bounded, linear operator N mapping a Hilbert space into itself is called normal if 
it commutes with its adjoint: 


$$N^*N = NN^*. \tag{30}$$


Clearly, every bounded symmetric operator is normal; as we will see in the next 
section, so is every unitary operator. 

The spectral resolution of normal operators is analogous to that of symmetric 
operators except that the spectrum of a normal operator may well contain complex 
numbers. 


Theorem 15. Let H be a Hilbert space, $N : H \to H$ a normal operator. Then there
is an orthogonal projection-valued measure E on the Borel subsets of the spectrum
of N such that

$$I = \int_{\sigma(N)} dE, \qquad N = \int_{\sigma(N)} \lambda\,dE. \tag{31}$$

The integrals exist in the norm topology.


Proof. As in theorem 9, we rely on a functional calculus for normal operators.
We will make use of Gelfand's theory of commutative B* algebras, invented for this
purpose; see chapters 18 and 19.

Let $q(\xi, \eta)$ be any polynomial in two real variables $\xi$, $\eta$. We rewrite it as a
polynomial Q in the complex variable $\zeta = \xi + i\eta$ and its conjugate $\bar{\zeta} = \xi - i\eta$:

$$q(\xi, \eta) = Q(\zeta, \bar{\zeta}). \tag{32}$$

We define the functional calculus by setting

$$Q = Q(N, N^*). \tag{32'}$$

Since N and N* commute, they commute with Q, and Q commutes with Q*. We
need the following version of the spectral mapping theorem for normal operators:


Lemma 16. For N a normal operator, and Q of form (32'), the spectrum of Q is of
form

$$\sigma(Q) = Q(\lambda, \bar{\lambda}), \qquad \lambda \in \sigma(N). \tag{33}$$


Proof. The operators Q of form (32') constitute a commutative algebra of
operators with a unit. Denote by F their closure in the operator norm; F is a commutative
Banach algebra with a unit, and therefore enjoys all the benefits of Gelfand's theory.
We appeal in particular to theorem 14 of chapter 18, according to which the spectrum
of any Q in F is of the form p(Q), where p is a homomorphism of F into $\mathbb{C}$. For Q
of form (32'),

$$p(Q) = Q(p(N), p(N^*)). \tag{34}$$

According to theorem 16 of chapter 19,

$$p(N^*) = \overline{p(N)}.$$

Set this into (34); since the numbers p(N) run through the spectrum of N as p runs
through all homomorphisms, we obtain (33). □


We appeal next to theorem 17 in chapter 19, which asserts that the norm of a normal
operator Q is equal to its spectral radius:

$$\|Q\| = |\sigma(Q)|. \tag{35}$$




Combining (35) with (33), we obtain the important result that for Q of form (32'):

$$\|Q\| = \max_{\lambda \in \sigma(N)} |Q(\lambda, \bar{\lambda})|. \tag{36}$$


We can now extend the functional calculus from polynomials to all continuous
functions on the spectrum $\sigma(N)$. Every continuous function f on $\sigma(N)$ can be
approximated uniformly by a sequence $q_n$ of polynomials; the relation (36) guarantees
that the corresponding sequence of operators $Q_n$ converges in norm to an operator
in F. The resulting functional calculus has the analogues of all properties listed for
the symmetric case in theorem 6.

We can now proceed, as in theorems 8 and 9, to construct a spectral resolution for
normal operators. We leave details to the reader.


31.7 SPECTRAL RESOLUTION OF UNITARY OPERATORS 


Definition. A unitary operator U is a linear isometric one-to-one mapping of a
Hilbert space onto itself:

$$\|Ux\| = \|x\|. \tag{37}$$

Exercise 8. Show that a unitary operator U preserves scalar products:

$$(Ux, Uy) = (x, y). \tag{37'}$$

It follows from (37') that

$$(x, U^*Uy) = (x, y);$$

since this holds for all vectors x and y in H, U*U is the identity: U*U = I. Since,
by definition, U is invertible (it is one-to-one and onto), it follows that U* is the
inverse of U:

$$U^* = U^{-1}. \tag{38}$$
Exercise 9. Show that every operator U that satisfies (38) is unitary. 


Exercise 10. Show that the spectrum of a unitary operator lies on the unit circle. 


Exercise 11. Show that if M is a symmetric operator, and k any nonzero real number,

$$U = (M + ikI)^{-1}(M - ikI) \tag{39}$$

is unitary. How about the converse?


Exercise 12. Combine exercises 10 and 11 to show that the spectrum of a bounded 
symmetric operator is real. 




Since every invertible operator commutes with its inverse, it follows from (38) 
that U and U* commute. This shows that every unitary operator U is normal; thus, 
according to theorem 15, U has a spectral resolution of form (31), only in this case 
the measure E is supported on the unit circle. Since unitary operators are by far the 
most important among normal operators, we give here a direct proof of this result. 

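For any vector x in H, form the following doubly infinite sequence $\{a_n\}$:

$$a_n = (U^n x, x), \quad n \text{ integer}. \tag{40}$$

We claim that this sequence is positive definite in the sense of chapter 14, section 4,

$$\sum_{n,m} a_{n-m}\, \phi_n \bar\phi_m \ge 0 \tag{41}$$

for any finite set of complex numbers $\phi_n$. To see this, we set (40) into (41) and use
$U^{-1} = U^*$:

$$\sum_{n,m} a_{n-m}\, \phi_n \bar\phi_m = \sum_{n,m} (U^{n-m}x, x)\, \phi_n \bar\phi_m = \Big( \sum_n \phi_n U^n x, \sum_m \phi_m U^m x \Big) = \Big\| \sum_n \phi_n U^n x \Big\|^2 \ge 0.$$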


Then according to Carathéodory's theorem, theorem 7 of chapter 14, the $a_n$ are the
Fourier coefficients of a nonnegative measure on the unit circle:

$$(U^n x, x) = \int e^{in\theta}\, dm_x. \tag{42}$$

As indicated, the measure $m_x$ depends on x. Setting n = 0 in (42) shows that the
total measure equals $\|x\|^2$.

For any pair of vectors x, y, $(U^n x, y)$ can be expressed as a simple linear combination
of $(U^n(x \pm y), x \pm y)$ and $(U^n(x \pm iy), x \pm iy)$. Using (42), we can write

$$(U^n x, y) = \int e^{in\theta}\, dm_{x,y}, \tag{43}$$

where

$$4 m_{x,y} = m_{x+y} - m_{x-y} + i\, m_{x+iy} - i\, m_{x-iy}. \tag{44}$$


Theorem 17. 


(i) The measures $m_{x,y}$ depend linearly on x, skew linearly on y.
(ii) $m_{x,y}$ is a skew symmetric function of x, y: $m_{y,x} = \overline{m_{x,y}}$.
(iii) The total mass of $m_{x,y}$ is $\le \|x\|\,\|y\|$.




Proof. (i) Every measure is uniquely determined by its Fourier coefficients. It 
follows that since the left side of (43) depends linearly on x, so does the right side. 

(ii) is an immediate consequence of the definition (44) of the measure $m_{x,y}$.

(iii) Take any set S on the unit circle; we claim that $m_{x,y}(S)$ is a scalar product
in H. By (i), it is linear in x; by (ii), it is a skew symmetric function of x and y.
Since $m_{x,x}(S) = m_x(S)$, it is nonnegative. Therefore the Schwarz inequality can be
applied:

$$|m_{x,y}(S)|^2 \le m_x(S)\, m_y(S).$$

We saw earlier that the total measure of $m_x$ is $\|x\|^2$. Since $m_x$ is nonnegative,
$m_x(S) \le \|x\|^2$, $m_y(S) \le \|y\|^2$, and so $|m_{x,y}(S)| \le \|x\|\,\|y\|$. □


Note that theorem 17 is a literal analogue of theorem 8. 

Let S be any Borel set on the unit circle. It follows from theorem 17 that $m_{x,y}(S)$
is a skew symmetric function of x and y, linear in x, and bounded by $\|x\|\,\|y\|$. It
follows then from theorem 1 that it can be represented as

$$m_{x,y}(S) = (E(S)x, y), \tag{45}$$

E(S) a bounded, symmetric operator. We claim that this family of operators has the
same properties as those listed in theorem 9. Here, for instance, is a demonstration
of property (vi):

(vi) $E(S \cap T) = E(S)E(T)$.

Proof. We set (45) into (43), obtaining

$$(U^n x, y) = \int e^{in\theta}\, d(Ex, y). \tag{46}$$

Replace n by n + k:

$$(U^{n+k} x, y) = \int e^{in\theta} e^{ik\theta}\, d(Ex, y); \tag{47}$$

on the other hand, using (46) with x replaced by $U^k x$, we can express the left side of
(47) as

$$\int e^{in\theta}\, d(EU^k x, y). \tag{48}$$

Two measures on the unit circle that have the same Fourier coefficients are identical;
therefore

$$e^{ik\theta}\, d(Ex, y) = d(EU^k x, y).$$




Integrate this over the set S:

$$\int c_S(\theta)\, e^{ik\theta}\, d(Ex, y) = (E(S)U^k x, y), \tag{49}$$

where $c_S$ is the characteristic function of S. We can, using the symmetry of E(S),
rewrite the right side of (49) as $(U^k x, E(S)y)$; using formula (46), with k in place of
n, we find that this is equal to

$$\int e^{ik\theta}\, d(Ex, E(S)y). \tag{49'}$$

The measures on the left in (49) and (49') have the same Fourier coefficients; therefore
they are identical:

$$c_S\, d(Ex, y) = d(Ex, E(S)y).$$

Integrate both sides over any other Borel set T:

$$\int c_T c_S\, d(Ex, y) = (E(T)x, E(S)y).$$

Since $c_T c_S = c_{S \cap T}$, the left side is $(E(S \cap T)x, y)$. Using the symmetry of E(S),
we find that the right side is $(E(S)E(T)x, y)$. Since they are equal for all x and y,

$$E(S \cap T) = E(S)E(T),$$

as asserted in (vi). □


We leave it to the reader to verify the rest of the properties listed in theorem 9.
Setting (45) into (43), we obtain

$$(U^n x, y) = \int e^{in\theta}\, d(Ex, y),$$

which is the weak version of

$$U^n = \int e^{in\theta}\, dE. \tag{50}$$

As in theorem 9, the integral on the right exists in the norm topology. This is the
spectral resolution of unitary operators.
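
In finite dimensions the measure E is a sum of point masses, and formulas (42) and (50) can be checked by hand. The following is a minimal sketch, not part of the text's argument: it assumes a 2 x 2 rotation matrix as the unitary operator, with the angle `alpha`, the test vector `x`, and the two projections (computed by hand for this example) all chosen purely for illustration.

```python
import cmath
import math

def inner(x, y):
    """Scalar product, linear in x and conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(m, u):
    """Matrix-vector product for a 2x2 matrix given as nested lists."""
    return [sum(m[i][j] * u[j] for j in range(2)) for i in range(2)]

alpha = 0.9                                   # hypothetical rotation angle
U = [[math.cos(alpha), -math.sin(alpha)],
     [math.sin(alpha), math.cos(alpha)]]

# Spectral measure of U, worked out by hand: atoms at theta = +alpha and -alpha.
E = {
    alpha: [[0.5, 0.5j], [-0.5j, 0.5]],       # projection onto span{(1, -i)}
    -alpha: [[0.5, -0.5j], [0.5j, 0.5]],      # projection onto span{(1, i)}
}

x = [1.0, 2.0 - 1.0j]                         # arbitrary test vector
# m_x assigns the mass (E({theta}) x, x) to each atom theta.
masses = {theta: inner(apply(P, x), x).real for theta, P in E.items()}

def a(n):
    """a_n = (U^n x, x) for n >= 0, by repeated application of U."""
    w = x
    for _ in range(n):
        w = apply(U, w)
    return inner(w, x)

# Compare a_n with the Fourier coefficients of m_x, as in (42).
gaps = [abs(a(n) - sum(cmath.exp(1j * n * th) * m for th, m in masses.items()))
        for n in range(6)]
print(masses, max(gaps))
```

The masses are nonnegative and add up to $\|x\|^2$, in line with the remark following (42).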


HISTORICAL NOTE. The spectral resolution of bounded symmetric operators is due
to Hilbert. The spectral representation, and theory of spectral multiplicity, is the work
of Ernst Hellinger (1883-1950), a student of Hilbert. He was professor of mathematics
at the University of Frankfurt until his dismissal by the Nazis. In the infamous
anti-Jewish pogrom in 1938, dubbed "Kristallnacht," he was taken to the dreaded
Dachau concentration camp. Miraculously, he was released; he found a new home in
the United States, teaching mathematics at Northwestern University.


BIBLIOGRAPHY 


Gelfand, I. M. Normierte Ringe. Mat. Sbornik, N.S. (51), 9 (1941): 3-24.


Halmos, P. Introduction to Hilbert Space and the Theory of Spectral Multiplicity. Chelsea Publishing,
New York, 1951.


Hellinger, E. Neue Begründung der Theorie quadratischer Formen von unendlichvielen Veränderlichen.
J. Math., 136 (1909): 210-271.


Hilbert, D. Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen. Nachr. Akad. Wiss.
Göttingen, Math.-Phys. Kl. (1906): 157-227.


Riesz, F. and Sz.-Nagy, B. Leçons d'analyse fonctionnelle. Akadémiai Kiadó, Budapest, 1952.


Stone, M. H. Linear Transformations in Hilbert Space and their Applications to Analysis. AMS Colloquium
Publications, 15. American Mathematical Society, New York, 1932.


Sz.-Nagy, B. Spektraldarstellung linearer Transformationen des Hilbertschen Raumes. Ergebnisse der
Math., 5. Springer, Berlin, 1942.


Wintner, A. Spektraltheorie der unendlichen Matrizen. Hirzel, Leipzig, 1929.


32


SPECTRAL THEORY OF 
SELF-ADJOINT OPERATORS 


In this chapter we present the spectral theory of unbounded self-adjoint operators.
We start with the observation, due to Hellinger and Toeplitz, that an operator M
that is defined everywhere on a Hilbert space H and is its own adjoint,

$$(Mx, y) = (x, My), \tag{1}$$

is necessarily bounded.

We show first that M is a closed operator. Suppose that $x_n$ is a convergent sequence,
$x_n \to x$, and that $Mx_n$ converges to some vector u. Setting $x = x_n$ into (1),
we get

$$(Mx_n, y) = (x_n, My),$$

and in the limit we get

$$(u, y) = (x, My).$$

By (1), the right side equals (Mx, y). Since this holds for all y, u = Mx. This proves
that M is closed. But then, according to the closed graph theorem, theorem 12 of
chapter 15, M is bounded.

It follows from this observation that unbounded operators that are their own adjoints
can be defined only on a subspace of Hilbert space. Here is the precise definition,
due to von Neumann:


Definition. Let H be a complex Hilbert space, D a dense subspace of H, and A a 
linear operator defined on D. The adjoint A* of A is the operator whose domain D* 
consists of all vectors v in H for which there is a vector denoted as A*v in H such 
that 


(Au, v) = (u, A*v) (2) 




holds for all u in D. Since D is dense, for any given v there can be only one such
vector A*v. Clearly, D* is a linear subspace of H, and A* is a linear operator on D*.
A is called self-adjoint if D* = D and A* = A.


32.1 SPECTRAL RESOLUTION


The main result to which this chapter is devoted is the spectral resolution of self- 
adjoint operators: 


Theorem 1. Let A be a self-adjoint operator in a Hilbert space H; denote the domain
of A by D. There is a spectral resolution for A, that is, an orthogonal projection-valued
measure E defined for all Borel measurable subsets of R, with the following
properties:

(i) $E(\emptyset) = 0$, $E(\mathbb{R}) = I$.
(ii) For any pair of measurable sets S and T, $E(S \cap T) = E(S)E(T)$.
(iii) For every measurable set S, $E^*(S) = E(S)$.
(iv) E commutes with A, that is, for any measurable set S, E(S) maps the domain
D of A into D, and for all u in D, AE(S)u = E(S)Au.
(v) The domain D of A consists of all vectors u for which

$$\int t^2\, d(E(t)u, u) < \infty \tag{3}$$

and

$$Au = \int t\, dE(t)u. \tag{4}$$


There are a number of proofs known for this important result, an extension of
theorem 9 of chapter 31. The first one ever was given by von Neumann, and we
will sketch it in section 32.2. Yet another approach will be indicated in section 32.3. 
A proof due to Marshall Stone will be outlined in chapter 34 on semigroups. The 
beautiful proof that we present here in all its details is due to Doob and Koopman. 

According to the Herglotz-Riesz theorem, theorem 6 of chapter 11, every analytic
function $f(\zeta)$ in the unit disk $|\zeta| < 1$ whose real part is positive can be expressed
uniquely as

$$f(\zeta) = ic + \int \frac{e^{i\theta} + \zeta}{e^{i\theta} - \zeta}\, dm(\theta), \tag{5}$$

where m is a nonnegative measure of finite total mass on the unit circle and c is real.

We can change the scene from the unit disk to the upper half-plane and obtain the 

following variant: 




Theorem 2. Every analytic function g(z) in the upper half-plane whose imaginary
part is positive can be expressed uniquely in the form

$$g(z) = a + mz + \int \frac{1 + tz}{t - z}\, ds(t), \tag{5'}$$

where s is a nonnegative measure of finite total mass on R, a is real and m nonnegative.


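Proof. The transformation

$$z = i\, \frac{1 + \zeta}{1 - \zeta} \tag{6}$$

maps conformally the unit disk onto the upper half-plane with the point $\zeta = 1$ going
to $\infty$. The functions $f(\zeta)$ and g(z) described above are related to each other by

$$g(z) = i f(\zeta).$$

Using the inverse of the transformation (6),

$$\zeta = \frac{z - i}{z + i}, \tag{6'}$$

and representation (5) for f, we can write

$$g(z) = -c + i \int \frac{e^{i\theta} + \zeta}{e^{i\theta} - \zeta}\, dm(\theta) = -c + i \int \frac{e^{i\theta/2}(z + i) + e^{-i\theta/2}(z - i)}{e^{i\theta/2}(z + i) - e^{-i\theta/2}(z - i)}\, dm(\theta)$$

$$= -c + i \int \frac{z \cos\theta/2 - \sin\theta/2}{i \cos\theta/2 + iz \sin\theta/2}\, dm(\theta) = -c + \int \frac{z\, \mathrm{cotan}\, \theta/2 - 1}{z + \mathrm{cotan}\, \theta/2}\, dm(\theta) = a + mz + \int \frac{1 + tz}{t - z}\, ds(t).$$

In the last step we renamed the variables: $a = -c$, $t = -\mathrm{cotan}\, \theta/2$, $ds(t) = dm(\theta)$,
and m = m(0), the mass located at $\theta = 0$. □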
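
The change of variables in this proof comes down to a pointwise identity between the two integrands: for $\theta \ne 0$, $i\,(e^{i\theta} + \zeta)/(e^{i\theta} - \zeta) = (1 + tz)/(t - z)$ when $\zeta = (z - i)/(z + i)$ and $t = -\mathrm{cotan}\, \theta/2$. A short numerical spot check is sketched below; the sample point `z` and the sample angles are arbitrary choices made for illustration, not values from the text.

```python
import cmath
import math

def disk_kernel(theta, zeta):
    """i * (e^{i theta} + zeta) / (e^{i theta} - zeta): the integrand of i*f from (5)."""
    e = cmath.exp(1j * theta)
    return 1j * (e + zeta) / (e - zeta)

def half_plane_kernel(t, z):
    """(1 + t z) / (t - z): the integrand of (5')."""
    return (1 + t * z) / (t - z)

def max_kernel_gap(z, thetas):
    """Largest discrepancy between the two kernels over the sample angles."""
    zeta = (z - 1j) / (z + 1j)           # inverse of the map from disk to half-plane
    gap = 0.0
    for theta in thetas:
        t = -1.0 / math.tan(theta / 2)   # t = -cotan(theta / 2)
        gap = max(gap, abs(disk_kernel(theta, zeta) - half_plane_kernel(t, z)))
    return gap

z = 0.3 + 1.7j                           # arbitrary sample point with Im z > 0
thetas = [0.4, 1.0, 2.5, 4.0, 5.5]       # sample angles, none equal to 0 mod 2*pi
gap = max_kernel_gap(z, thetas)
print(gap)
```

The excluded angle $\theta = 0$ is where $t$ becomes infinite; its point mass is what produces the separate term mz.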
The next result is a sharpening of theorem 2: 


Theorem 3 (Nevanlinna). Every analytic function g(z) in the upper half-plane
Im z > 0 whose imaginary part is positive and satisfies the growth condition

$$\limsup_{y \to \infty}\, y\, |g(iy)| < \infty \tag{7}$$

can be expressed uniquely in the form

$$g(z) = \int \frac{dn(t)}{t - z}, \tag{8}$$

where n is a nonnegative measure of finite total mass on R. Furthermore

$$n(\mathbb{R}) = \lim_{y \to \infty} y\, \mathrm{Im}\, g(iy). \tag{9}$$


Proof. Since g has positive imaginary part in the upper half-plane, it can be written
in the form (5'). Set z = iy and state the boundedness condition (7) for the real and
the imaginary part separately:

$$\limsup_{y \to \infty}\, y\, \Big| a - \int \frac{t(y^2 - 1)}{t^2 + y^2}\, ds \Big| \le M, \tag{7a}$$

and

$$\limsup_{y \to \infty}\, y^2 \Big( m + \int \frac{1 + t^2}{t^2 + y^2}\, ds \Big) \le M. \tag{7b}$$

The integral on the left in (7b) tends to 0 as y tends to $\infty$; it follows therefore that
the nonnegative quantity m must be zero. Thus (7b) asserts that

$$\limsup_{y \to \infty} \int \frac{y^2}{t^2 + y^2}\, (1 + t^2)\, ds(t) \le M.$$

Taking the limit as $y \to \infty$ of the integral on the left, we conclude that

$$\int (1 + t^2)\, ds(t) \le M.$$

We define

$$dn(t) = (1 + t^2)\, ds(t); \tag{10}$$

it follows that n is a nonnegative measure whose total mass is $\le M$. It follows from
(7a) that

$$a = \lim_{y \to \infty} \int \frac{t(y^2 - 1)}{t^2 + y^2}\, ds = \int t\, ds(t).$$

Setting this value of a into (5') and setting m = 0, we obtain

$$g(z) = \int \Big( t + \frac{1 + tz}{t - z} \Big)\, ds = \int \frac{1 + t^2}{t - z}\, ds = \int \frac{dn(t)}{t - z}.$$

Uniqueness of the measure follows from uniqueness of the measure in (5). Relation
(9) follows from (8). □




Given any measure n on R, real or complex, of finite total mass, formula (8)
defines a pair of analytic functions g in the upper and lower half-plane; g is called
the Cauchy transform of the measure n.
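
For a measure made of finitely many point masses the Cauchy transform (8) is a finite sum, and relation (9) can be observed numerically. The sketch below assumes a hypothetical measure with three atoms; the atoms, the masses, and the sample values of y are illustrative choices only.

```python
def cauchy_transform(points, weights, z):
    """g(z) = sum_j w_j / (t_j - z): formula (8) for n = sum_j w_j * delta_{t_j}."""
    return sum(w / (t - z) for t, w in zip(points, weights))

points = [-2.0, 0.5, 3.0]     # hypothetical atoms t_j
weights = [0.25, 1.0, 0.75]   # hypothetical nonnegative masses w_j
total_mass = sum(weights)     # n(R)

# y * Im g(iy) should approach n(R) as y grows, as relation (9) asserts.
for y in (1e1, 1e3, 1e5):
    g = cauchy_transform(points, weights, 1j * y)
    print(y, y * g.imag, y * abs(g))
```

Each term contributes $w_j\, y^2/(t_j^2 + y^2)$ to $y\, \mathrm{Im}\, g(iy)$, which increases to $w_j$; this is the same monotone limit used in the proof of theorem 3.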


Lemma 4. The Cauchy transform (8) is one-to-one, i.e. a complex measure n of 
finite mass is uniquely determined by its Cauchy transform. 


Proof. We have to show that if g(z) defined by (8) is zero for all nonreal z then
n = 0. Replace z in (8) by $\bar z$ and take the complex conjugate. We get

$$\overline{g(\bar z)} = 0 = \int \frac{d\bar n(t)}{t - z}.$$

This shows that if the Cauchy transform of n is zero, so are the Cauchy transforms
of its real and imaginary parts. The real part of n can be decomposed into its positive
and negative parts:

$$\mathrm{Re}\, n = n_+ - n_-,$$

$n_+$ and $n_-$ nonnegative measures. It follows that $n_+$ and $n_-$ have the same Cauchy
transforms. Therefore by the uniqueness part of theorem 3, $n_+ = n_-$, and so
Re n = 0. Analogously Im n = 0. □


Definition. A number z belongs to the resolvent set of A iff A — zI maps D one-to- 
one onto H. 


Theorem 5. Let H be a complex Hilbert space, A a self-adjoint operator acting in H.
All nonreal complex numbers z belong to the resolvent set of A.


Proof. We show first that the range of $A - zI$ is a closed subspace of H. The range
consists of all vectors u of the form

$$Av - zv = u, \quad v \text{ in } D.$$

Take the scalar product of both sides with v:

$$(Av, v) - z(v, v) = (u, v).$$

Since A is symmetric, (Av, v) is real, so the imaginary part on the left side is
$-\mathrm{Im}\, z\, \|v\|^2$. Using the Schwarz inequality on the right we deduce that

$$|\mathrm{Im}\, z|\, \|v\|^2 \le \|u\|\, \|v\|,$$

which implies that

$$\|v\| \le \frac{1}{|\mathrm{Im}\, z|}\, \|u\|. \tag{11}$$




Let $u_n$ be a sequence of vectors in the range of $A - zI$ that converges to some vector u:

$$Av_n - zv_n = u_n.$$

It follows from inequality (11) that $\|v_n - v_m\| \le \|u_n - u_m\| / |\mathrm{Im}\, z|$. Therefore also
the $v_n$ converge to some limit v. We claim that this limit v is in D. To see this we take
the limit of the above relation as $n \to \infty$. The right side tends to u, and the second
term on the left tends to $-zv$. Therefore the first term $Av_n$ also tends to a limit, call
it r:

$$r - zv = u. \tag{12}$$

Now take the scalar product of $Av_n$ with any vector w in D, and use the self-adjointness
of A:

$$(Av_n, w) = (v_n, Aw).$$

Take the limit as $n \to \infty$:

$$(r, w) = (v, Aw).$$


By definition of self-adjointness, this shows that v belongs to the domain D of A,
and that Av = r. Combined with (12), this shows that u belongs to the range of
$A - zI$, so the range is closed.

If the range of $A - zI$ were not all of H, there would be a nonzero vector k in H
orthogonal to it:

$$(Av - zv, k) = (Av, k) - (v, \bar z k) = 0$$

for all v in D. By definition of self-adjointness, it follows that k belongs to D, and
$Ak = \bar z k$. But then $(k, Ak) = z(k, k)$ is not real, contrary to the symmetry of A.
This completes the proof that $A - zI$ maps D onto H. That it is one-to-one follows
as above, for otherwise some k in D would be mapped into 0 by $A - zI$, contrary to
(11). □


We denote the resolvent of A by

$$R(z) = (A - zI)^{-1}.$$

It follows from (11) that for z nonreal

$$\|R(z)\| \le \frac{1}{|\mathrm{Im}\, z|}. \tag{13}$$

Corollary 1. R(z) is an analytic function of z on the resolvent set of A.

Proof. Choose any vector u in H, and denote R(z)u as v(z). By definition of R,

$$(A - z)v(z) = u.$$

Similarly

$$(A - (z + h))v(z + h) = u.$$

Subtract these two equations and divide by h:

$$(A - z)\, \frac{v(z + h) - v(z)}{h} = v(z + h),$$

which is the same as

$$\frac{v(z + h) - v(z)}{h} = R(z)v(z + h) = R(z)R(z + h)u.$$

Using the estimate (13), we conclude that v depends Lipschitz continuously on z.
Letting $h \to 0$, we deduce that v(z) is differentiable in the complex plane, so R(z)
is holomorphic in the strong topology. □

We claim that the adjoint of R(z) is $R(\bar z)$. To see this, choose any two vectors u
and w in H. Denoting R(z)u = v, $(A - z)v = u$, we have

$$(u, R(\bar z)w) = ((A - z)v, R(\bar z)w) = (v, (A - \bar z)R(\bar z)w) = (v, w) = (R(z)u, w).$$ □
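
Both the estimate (13) and the relation $R(z)^* = R(\bar z)$ can be seen concretely for a symmetric matrix, which is of course a bounded operator and so only a toy instance of the theorem. The sketch below assumes a hypothetical 2 x 2 real symmetric matrix `A`, a nonreal point `z`, and a test vector `u`, all chosen for illustration.

```python
def resolvent(a, z):
    """R(z) = (A - z I)^{-1} for a 2x2 matrix A given as nested lists."""
    p, q = a[0][0] - z, a[0][1]
    r, s = a[1][0], a[1][1] - z
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def apply(m, u):
    """Matrix-vector product for a 2x2 matrix."""
    return [m[0][0] * u[0] + m[0][1] * u[1], m[1][0] * u[0] + m[1][1] * u[1]]

def norm(u):
    """Euclidean norm of a vector with two complex entries."""
    return (abs(u[0]) ** 2 + abs(u[1]) ** 2) ** 0.5

def adjoint(m):
    """Conjugate transpose of a 2x2 matrix."""
    return [[m[0][0].conjugate(), m[1][0].conjugate()],
            [m[0][1].conjugate(), m[1][1].conjugate()]]

A = [[2.0, 1.0], [1.0, 2.0]]   # hypothetical symmetric matrix, eigenvalues 1 and 3
z = 1.5 + 0.2j                 # nonreal point, chosen close to the spectrum
u = [1.0 + 2.0j, -0.5j]        # arbitrary test vector

R = resolvent(A, z)
# Estimate (13) applied to u: ||R(z)u|| <= ||u|| / |Im z|.
bound_holds = norm(apply(R, u)) <= norm(u) / abs(z.imag)

# Largest entrywise gap between R(z)^* and R(conj z).
Rs, Rc = adjoint(R), resolvent(A, z.conjugate())
adjoint_gap = max(abs(Rs[i][j] - Rc[i][j]) for i in range(2) for j in range(2))
print(bound_holds, adjoint_gap)
```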


For any u in H we define the complex valued function g(z) for z nonreal as
follows:

$$g(z) = (R(z)u, u). \tag{14}$$

To indicate its dependence on u, we will write $g(z) = g_u(z)$ when necessary.


Lemma 6. For any u in H, g defined by (14) has these properties:

(i) g is an analytic function of z in the upper half-plane Im z > 0, and its imaginary
part there is nonnegative.
(ii) $y\, |g(iy)| \le \|u\|^2$.
(iii) $\lim_{y \to \infty} y\, \mathrm{Im}\, g_u(iy) = \|u\|^2$.
(iv) $g(\bar z) = \overline{g(z)}$.


Proof. The analyticity of g follows from the analytic character of R(z). Denote R(z)u
by v; then

$$u = (A - z)v = Av - zv. \tag{15}$$

Take the scalar product of (15) with v; since by (14) g = (v, u), we get

$$g(z) = (v, u) = (v, Av - zv) = (v, Av) - \bar z\, (v, v). \tag{16}$$

Since A is self-adjoint, $(v, Av) = (Av, v) = \overline{(v, Av)}$ is real, so we deduce from (16)
that

$$\mathrm{Im}\, g(z) = y\, (v, v), \quad y = \mathrm{Im}\, z. \tag{17}$$

This proves that the imaginary part of g is positive in the upper half-plane, as claimed
in (i).


(ii) By the Schwarz inequality,

$$|g_u(z)| = |(R(z)u, u)| \le \|R(z)u\|\, \|u\| \le \|R(z)\|\, \|u\|^2;$$

using inequality (13) for $\|R(z)\|$, we obtain (ii).

(iii) Take the scalar product of (15) with u; since by (14) $g_u = (v, u)$, we get

$$\|u\|^2 = (Av, u) - z\, g_u(z).$$

Set z = iy and take the real part of this relation:

$$\|u\|^2 = \mathrm{Re}\, (Av, u) + y\, \mathrm{Im}\, g_u(iy). \tag{18}$$

To complete the proof of (iii), we have to show that the first term on the right in
(18) tends to zero as y tends to $\infty$. To see this, we first estimate this term by the
Schwarz inequality:

$$|\mathrm{Re}\, (Av, u)| \le |(Av, u)| \le \|Av\|\, \|u\|.$$

To see that $\|Av\|$ tends to zero, we write

$$Av = AR(z)u = (I + zR(z))u.$$

Setting z = iy and using inequality (13), we get the estimate

$$\|AR(iy)\| \le 1 + y\, \|R(iy)\| \le 2. \tag{19}$$

This shows that the operators AR(iy) are uniformly bounded. Clearly, it suffices to
show that AR(iy)u tends to zero for a set of u dense in H. D is such a set, for then
by (13)

$$\|AR(iy)u\| = \|R(iy)Au\| \le \|R(iy)\|\, \|Au\| \le \|Au\| / y.$$

(iv) Since $R^*(z) = R(\bar z)$, we have

$$g(\bar z) = (R(\bar z)u, u) = (R^*(z)u, u) = (u, R(z)u) = \overline{(R(z)u, u)} = \overline{g(z)}.$$ □

Lemma 6 shows that for any u in H the function g(z) defined by (14) satisfies the
hypotheses of theorem 3. Therefore we conclude that for Im z positive g(z) can be
represented in form (8):

$$(R(z)u, u) = \int \frac{dn(t)}{t - z}. \tag{20}$$




The nonnegative measure n depends on the vector u, which we indicate as $n = n_u$.
It follows from part (iv) that the representation (20) holds in the lower half-plane as
well.

We deduce from relation (9) and part (iii) of lemma 6 that

$$n_u(\mathbb{R}) = \|u\|^2. \tag{21}$$


For any pair of vectors u and v we can express (R(z)u, v) as a linear combination
of $(R(z)(u \pm v), u \pm v)$ and $(R(z)(u \pm iv), u \pm iv)$. This leads to an integral
representation of (R(z)u, v) as a Cauchy transform:

$$(R(z)u, v) = \int \frac{dn_{u,v}(t)}{t - z}. \tag{22}$$

The measure $n_{u,v}$ has a simple expression in terms of $n_{u \pm v}$ and $n_{u \pm iv}$; see formula
(44) in chapter 31, section 31.7.
The properties of the measures ny,y are summarized in 


Lemma 7.

(i) $n_{u,u} = n_u$.
(ii) $n_{u,v}$ depends linearly on u, skew-linearly on v.
(iii) $n_{u,v}$ is a skew-symmetric function of u and v: $n_{v,u} = \overline{n_{u,v}}$.
(iv) The total variation of $n_{u,v}$ is $\le \|u\|\, \|v\|$.

The proof is identical to the one given for theorem 17 in chapter 31, section 31.7; it
is based on the simple explicit expression for $n_{u,v}$. □


We appeal now to theorem 1 in chapter 31. A bounded, skew symmetric, skew
bilinear functional b(u, v) in a Hilbert space H can be represented uniquely as

$$b(u, v) = (Eu, v),$$

where E is a bounded symmetric operator acting on H. Lemma 7 says that for any set
S, $n_{u,v}(S)$ is such a functional; therefore there exist bounded, symmetric operators
E(S) such that

$$n_{u,v}(S) = (E(S)u, v). \tag{23}$$

Setting this into (22), we obtain

$$(R(z)u, v) = \int \frac{1}{t - z}\, d(Eu, v). \tag{24}$$

We claim that the operators E defined by (23) furnish the spectral resolution for A,
meaning that they have the properties proposed in theorem 1 at the beginning of this
chapter.




Proof. (i) Since by (23), $n_{u,v}(\emptyset) = 0$ for all u, v, so is $(E(\emptyset)u, v)$; this makes
$E(\emptyset) = 0$. By (21) and (23),

$$(E(\mathbb{R})u, u) = n_u(\mathbb{R}) = \|u\|^2 = (u, u).$$

It is easy to deduce from this that $E(\mathbb{R})$ is the identity.

(ii) We show first that the operators R(z) and E commute. To see this, we start
with the fact that for arbitrary nonreal complex numbers z and w the operators R(z)
and R(w) commute. Therefore for any pair of vectors u and v in H,

$$(R(w)R(z)u, v) = (R(z)R(w)u, v). \tag{25}$$

We use the adjoint of R(w) to rewrite the left side of (25). Employing then the
representation (24) with $R^*(w)v$ in place of v yields

$$(R(z)u, R^*(w)v) = \int \frac{1}{t - z}\, d(Eu, R^*(w)v) = \int \frac{1}{t - z}\, d(R(w)Eu, v). \tag{26}$$

We rewrite the right side of (25) by employing the representation (24) with R(w)u
in place of u. We get

$$\int \frac{1}{t - z}\, d(ER(w)u, v). \tag{26'}$$

The Cauchy transforms (26) and (26') are identical functions of z. Therefore, by
lemma 4, the representing measures are identical:

$$(R(w)E(S)u, v) = (E(S)R(w)u, v)$$

for every measurable set S. Since this holds for arbitrary u and v, it follows that

$$R(w)E(S) = E(S)R(w)$$

for arbitrary set S and any nonreal complex number w.

To show that $E(S \cap T) = E(S)E(T)$, we use the resolvent identity

$$R(z)R(w) = \frac{R(z) - R(w)}{z - w}$$

to rewrite (25); using (24) twice for both z and w yields for the right side of (25),

$$\frac{1}{z - w} \int \Big( \frac{1}{t - z} - \frac{1}{t - w} \Big)\, d(Eu, v) = \int \Big( \frac{1}{t - z} \Big) \Big( \frac{1}{t - w} \Big)\, d(Eu, v). \tag{27}$$

We compare now (26) and (27); appealing once more to lemma 4, we conclude that
the measures appearing in these two formulas are the same. Therefore for any
measurable set S,

$$(R(w)E(S)u, v) = \int \frac{c_S(t)}{t - w}\, d(Eu, v), \tag{28}$$




where $c_S(t)$ is the characteristic function of the set S. We use now formula (24), with
w in place of z and E(S)u in place of u, to rewrite the left side of (28) as

$$\int \frac{1}{t - w}\, d(EE(S)u, v). \tag{28'}$$

We compare (28') with the right side of (28); we appeal once more to lemma 4 to
conclude that the measures appearing in these formulas are identical: for any
measurable set T,

$$(E(T)E(S)u, v) = \int_T c_S(t)\, d(Eu, v) = (E(S \cap T)u, v).$$

Since this holds for arbitrary vectors u and v, it follows that $E(T)E(S) = E(S \cap T)$,
as asserted in (ii).

(iii) The symmetry of E follows from the skew symmetry of $n_{u,v}(S)$.

(iv) We have already shown that E commutes with R = R(z). To show that it
commutes with A, let v be any vector in the domain of A; then v can be written in
the form v = Ru, u in H. Using the identity AR = I + zR, we get


EAv = EARu = Eu + zERu. 
Similarly, since E and R commute, we can use the same identity to write 
AEv = AERu = AR(Eu) = Eu + zREu = Eu + zERu. 


Comparing the last two identities we conclude that EAv = AEv, and that E maps D 
into D. 

(v) Suppose that v belongs to the domain D of A. Then, since R(z) maps H onto
D, we can write v as v = R(z)u. It is convenient to choose z = i. Let T denote any
measurable set. Using previously established properties of E and R and the resolvent
identity, we can derive the following string of identities:

$$(E(T)v, v) = (E(T)Ru, Ru) = (R^*RE(T)u, u) = (R(-i)R(i)E(T)u, u)$$

$$= \frac{1}{2i}\, \big( [R(i) - R(-i)]E(T)u, u \big) = \frac{1}{2i} \int_T \Big( \frac{1}{t - i} - \frac{1}{t + i} \Big)\, d(Eu, u) = \int_T \frac{1}{1 + t^2}\, d(Eu, u).$$

This proves that

$$d(Ev, v) = \frac{1}{1 + t^2}\, d(Eu, u), \quad v = R(i)u,$$

and therefore

$$(1 + t^2)\, d(Ev, v) = d(Eu, u).$$

It follows that, as asserted in (3), for v in D,

$$\int t^2\, d(Ev, v) < \infty. \tag{3}$$


Next we show that for v in D the Riemann integral

$$\int t\, dE(t)v \tag{4}$$

with respect to the vector-valued measure Ev converges. To see this, consider any
finite interval S of R, and any decomposition $S = \cup S_j$ of S into disjoint subintervals,
each of length less than 1. A Riemann sum corresponding to this decomposition is

$$\sum_j t_j E(S_j)v = \sum_j t_j v_j, \quad t_j \in S_j, \tag{4'}$$

where $v_j$ abbreviates $E(S_j)v$. It follows from property (ii) that the vectors $v_j$ are
pairwise orthogonal, so the norm square of the Riemann sum (4') is

$$\sum_j t_j^2\, \|v_j\|^2.$$

Since this is a Riemann sum for the integral (3), it is bounded uniformly for all
decompositions. The convergence of the integral (4) follows.

To determine the value of this integral, we take any vector w in H. Let T be any
measurable set. Using previously established properties of E and R, we can derive
the following string of identities:

$$(E(T)v, w) = (E(T)Ru, w) = (RE(T)u, w) = \int \frac{1}{t - z}\, d(EE(T)u, w) = \int_T \frac{1}{t - z}\, d(Eu, w).$$

This proves that

$$d(Ev, w) = \frac{1}{t - z}\, d(Eu, w),$$

and therefore

$$(t - z)\, d(Ev, w) = d(Eu, w).$$

Setting $u = (A - z)v$ on the right above gives

$$t\, d(Ev, w) = d(EAv, w).$$

Integrating this relation over R, we deduce that

$$\int t\, d(Ev, w) = (Av, w).$$

The left side is the scalar product of the integral (4) with w; since w is arbitrary, it follows
that

$$\int t\, dE(t)v = Av.$$

This completes the proof of theorem 1. □
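
For a symmetric matrix, theorem 1 reduces to the familiar eigenvalue decomposition, and the integrals in (24) and (4) become finite sums over the eigenvalues. The sketch below is only a finite-dimensional illustration under stated assumptions: the matrix `A`, its two spectral projections (worked out by hand), the vectors `u`, `v`, and the point `z` are hypothetical choices, not objects from the text.

```python
def inner(x, y):
    """Scalar product, linear in x and conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(m, u):
    """Matrix-vector product for a 2x2 matrix given as nested lists."""
    return [sum(m[i][j] * u[j] for j in range(2)) for i in range(2)]

def resolvent(a, z):
    """R(z) = (A - z I)^{-1} for a 2x2 matrix."""
    p, q, r, s = a[0][0] - z, a[0][1], a[1][0], a[1][1] - z
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

A = [[2.0, 1.0], [1.0, 2.0]]
# Spectral data of A, worked out by hand: eigenvalues 1 and 3.
spectrum = {
    1.0: [[0.5, -0.5], [-0.5, 0.5]],   # E({1}), projection onto span{(1, -1)}
    3.0: [[0.5, 0.5], [0.5, 0.5]],     # E({3}), projection onto span{(1, 1)}
}

u = [1.0 + 1.0j, 2.0]
v = [0.5, -1.0j]
z = 0.7 + 0.4j

# Left and right sides of (24): (R(z)u, v) = integral of d(Eu, v) / (t - z).
lhs = inner(apply(resolvent(A, z), u), v)
rhs = sum(inner(apply(E, u), v) / (t - z) for t, E in spectrum.items())

# Formula (4): A u = integral of t dE(t) u.
Au = apply(A, u)
Au_spectral = [sum(t * apply(E, u)[i] for t, E in spectrum.items()) for i in range(2)]
print(abs(lhs - rhs), Au, Au_spectral)
```

Condition (3) is automatic here because the measure has finitely many atoms; it only becomes a genuine restriction on u when A is unbounded.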


Corollary. If x belongs to the domain of $A^m$, then for every v in H,

$$(A^k x, v) = \int t^k\, d(Ex, v), \quad k \le m. \tag{29}$$


Exercise 1. Prove relation (29). 


We sketch now two other approaches to constructing a spectral resolution of a 
self-adjoint operator. 


32.2 SPECTRAL RESOLUTION USING THE CAYLEY TRANSFORM 


We present now von Neumann’s original approach employing the Cayley transform 
of A: 


$$U = (A - i)(A + i)^{-1}. \tag{30}$$


We urge the reader to recall from chapter 31, section 31.7, the notion of a unitary 
operator. 


Theorem 8. The operator U is unitary, that is, a norm preserving mapping of H 
onto H. 


Proof. Since the operators $A \pm i$ map D(A) one-to-one onto H, U maps H onto
itself. We claim that U is norm preserving. To see this, let u be any vector in H;
denote by v and w the vectors

$$v = (A + i)^{-1}u, \quad w = Uu.$$

Then

$$(A + i)v = u, \quad (A - i)v = w.$$

Taking scalar products and using the symmetry of A, we get

$$\|u\|^2 = ((A + i)v, (A + i)v) = \|Av\|^2 + i[(v, Av) - (Av, v)] + \|v\|^2 = \|Av\|^2 + \|v\|^2,$$

and similarly

$$\|w\|^2 = ((A - i)v, (A - i)v) = \|Av\|^2 + \|v\|^2.$$

This proves that U is norm preserving. □

Then von Neumann appeals to the spectral resolution of the unitary operator U in
terms of a projection-valued measure on the unit circle; pulled back to the real axis
by $(t - i)/(t + i) = e^{i\theta}$, this furnishes the spectral resolution of A.
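
The Cayley transform (30) can be tried out on a symmetric matrix, where A is bounded and all domains are the whole space. The sketch below assumes a hypothetical 2 x 2 real symmetric matrix with eigenvalues 1 and 3; it checks that U*U = I and that each eigenvalue t is carried to the point $(t - i)/(t + i)$ of the unit circle.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(m):
    """Inverse of a 2x2 matrix by the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def shift(a, c):
    """A + c I for a 2x2 matrix."""
    return [[a[0][0] + c, a[0][1]], [a[1][0], a[1][1] + c]]

def adjoint(m):
    """Conjugate transpose of a 2x2 matrix."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]

A = [[2.0, 1.0], [1.0, 2.0]]                      # hypothetical symmetric matrix
U = matmul(shift(A, -1j), inverse(shift(A, 1j)))  # Cayley transform (30)

# U is unitary iff U* U = I; measure the largest entrywise deviation.
UstarU = matmul(adjoint(U), U)
unitarity_gap = max(abs(UstarU[i][j] - (1 if i == j else 0))
                    for i in range(2) for j in range(2))

# Each eigenvalue t of A is carried to (t - i)/(t + i) on the unit circle.
images = [(t - 1j) / (t + 1j) for t in (1.0, 3.0)]
print(unitarity_gap, [abs(w) for w in images])
```

No real t is mapped to the point 1 of the circle; that point corresponds to $t = \infty$, which is why it matters only for unbounded A.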


32.3 A FUNCTIONAL CALCULUS FOR SELF-ADJOINT OPERATORS 


The resolvent set of A consists of all numbers z for which A — zI maps D one-to-one 
onto H. The spectrum of A is the complement of the resolvent set. According to 
theorem 5, the spectrum of A lies on the real axis. 


Exercise 1. Show that the spectrum of an unbounded self-adjoint operator is a
closed, unbounded set on the real axis.


The extended spectrum of an unbounded self-adjoint operator is its spectrum
compactified by adjoining $\infty$.

In this section we will define f(A) for every f that is continuous and real valued
on the extended spectrum of the self-adjoint operator A.

We start with a few observations about powers of a self-adjoint operator. For any
natural number k, the domain of $A^k$, denoted as $D(A^k)$, consists of all vectors x for
which $x, Ax, \ldots, A^{k-1}x$ lie in D, the domain of the operator A.


Exercise 2. Show that $A^k$ is a self-adjoint operator.

Lemma 9. For any self-adjoint operator A acting on a Hilbert space H,

(i) $A^2 + I$ has a bounded inverse.
(ii) $(A^2 + I)^{-n}$ maps H into the domain of $A^{2n}$.


Proof. (i) According to theorem 5, $A + iI$ and $A - iI$ both have bounded inverses.
Therefore so does their product

$$A^2 + I = (A + iI)(A - iI).$$

(ii) We show first that both $(A + iI)^{-1}$ and $(A - iI)^{-1}$ map $D(A^{k-1})$ into $D(A^k)$.
To see this, let y denote any vector in $D(A^{k-1})$; by theorem 5, $(A + iI)^{-1}y = x$
belongs to D. For k = 1, this is what we want to know; for k > 1 we write
$y = (A + iI)x$ and rearrange it as

$$Ax = y - ix, \tag{31}$$

which shows that x belongs to $D(A^2)$. For k = 2, this is what we want to know; for
k > 2, let A act on (31), and use (31) on the right:

$$A^2 x = Ay - iAx = Ay - iy - x.$$

This shows that x belongs to $D(A^3)$, and so on, until we place x in $D(A^k)$. Ditto for
$(A - iI)^{-1}$.

We write now

$$(A^2 + I)^{-n} = (A - iI)^{-n}(A + iI)^{-n}. \tag{32}$$

Since the k-th factor $(A \pm iI)^{-1}$ maps $D(A^{k-1})$ into $D(A^k)$, the 2n factors on the right
in (32) map H into $D(A^{2n})$. □


Exercise 3. Show that $(A^2 + I)^{-n}$ maps H onto the domain of $A^{2n}$.


Let $q(\lambda)$ be any polynomial of degree $\le 2n$. It follows from lemma 9 that
$q(A)(A^2 + I)^{-n}$ is an everywhere defined, bounded operator.


Exercise 4. Show that if the coefficients of q are real, $q(A)(A^2 + I)^{-n}$ is symmetric.
(Hint: First show that the domain of $A^2$ is dense in H.)


The following version of the spectral mapping theorem holds: 


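Lemma 10. Let q denote a polynomial with real coefficients of degree $\le 2n$ and
abbreviate $q(\lambda)(\lambda^2 + 1)^{-n}$ as $r(\lambda)$. The spectrum of r(A) consists of all real numbers
$\sigma$ of form $\sigma = r(\lambda)$, $\lambda$ in the extended spectrum of A.


Proof. Write

$$r(\lambda) - \sigma = \frac{q(\lambda) - \sigma(\lambda^2 + 1)^n}{(\lambda^2 + 1)^n}.$$

The numerator has real and complex zeros, the latter in conjugate pairs. Denote the
nonreal zeros by $\alpha_j \pm i\beta_j$, j = 1, ..., k, the real zeros by $\lambda_\nu$, $\nu$ = 1, ..., l, and the
leading coefficient of the numerator by c. Factor it as

$$r(\lambda) - \sigma = c\, \frac{\prod_{j=1}^{k} \big( (\lambda - \alpha_j)^2 + \beta_j^2 \big)\, \prod_{\nu=1}^{l} (\lambda - \lambda_\nu)}{(\lambda^2 + 1)^n}. \tag{33}$$

Then

$$r(A) - \sigma I = c \prod_{j=1}^{k} \big( (A - \alpha_j)^2 + \beta_j^2 I \big)(A^2 + I)^{-1}\; \prod_{\nu=1}^{l} (A - \lambda_\nu I)\; (A^2 + I)^{-(n-k)}.$$

The partial product

$$\prod_{j=1}^{k} \big( (A - \alpha_j)^2 + \beta_j^2 I \big)(A^2 + I)^{-1}$$

is a one-to-one map of H onto H. The remaining factor is a one-to-one map of H
onto H iff all the $\lambda_\nu$ belong to the resolvent set of A, and if the total number of zeros
of the numerator is 2n. Since it follows from (33) that $r(\lambda_\nu) = \sigma$, the two conditions
above for the invertibility of $r(A) - \sigma I$ can be stated so: $r(\lambda) \ne \sigma$ for any real $\lambda$ in
the spectrum of A, and $r(\infty) \ne \sigma$. □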


According to lemma 9 and exercise 4, for the rational functions $r(\lambda)$ considered
above, r(A) is a bounded, symmetric operator. We appeal now to theorem 3 of chapter 31:
$\|r(A)\|$ equals the spectral radius of r(A). According to lemma 10 the spectrum
of r(A) is the range of $r(\lambda)$ on the extended spectrum $\sigma(A)$ of A. Therefore

$$\|r(A)\| = \max_{\lambda \in \sigma(A)} |r(\lambda)|. \tag{34}$$
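
Relation (34) can be observed on a symmetric matrix, where the spectrum is a finite set and the point at infinity plays no role. The sketch below assumes a hypothetical 2 x 2 matrix with eigenvalues 1 and 3 and takes $q(\lambda) = \lambda$, n = 1, so that $r(\lambda) = \lambda/(\lambda^2 + 1)$; these choices are for illustration only.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(m):
    """Inverse of a 2x2 matrix by the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def r(lam):
    """r(lambda) = q(lambda) / (lambda^2 + 1)^n with q(lambda) = lambda, n = 1."""
    return lam / (lam ** 2 + 1)

A = [[2.0, 1.0], [1.0, 2.0]]          # hypothetical symmetric matrix, spectrum {1, 3}
A2 = matmul(A, A)
A2_plus_I = [[A2[i][j] + (1.0 if i == j else 0.0) for j in range(2)]
             for i in range(2)]
rA = matmul(A, inverse(A2_plus_I))    # r(A) = A (A^2 + I)^{-1}

# A matrix of the form [[a, b], [b, a]] has eigenvalues a + b and a - b,
# and the norm of a symmetric matrix is its largest eigenvalue in absolute value.
eigenvalues = (rA[0][0] + rA[0][1], rA[0][0] - rA[0][1])
operator_norm = max(abs(e) for e in eigenvalues)
max_on_spectrum = max(abs(r(t)) for t in (1.0, 3.0))
print(eigenvalues, operator_norm, max_on_spectrum)
```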


The rational functions r described above form an algebra over the reals. We claim
that they separate points of the extended real line. Clearly, if $\lambda_1$ and $\lambda_2$ have the same
sign, $(\lambda^2 + 1)^{-1}$ has different values at $\lambda_1$ and $\lambda_2$; ditto if $\lambda_1$ or $\lambda_2$ is $\infty$. If $\lambda_1$ and $\lambda_2$
have opposite signs, $\lambda(\lambda^2 + 1)^{-1}$ separates them. Furthermore the constant function
belongs to the algebra. We appeal now to the


Stone-Weierstrass Theorem. An algebra of real valued functions on a compact 
Hausdorff space that separates points, and contains the constant functions, is dense 
in the space of all continuous functions, normed by the maximum norm. 


For a proof see section 3 of chapter 13. 

It follows that the rational functions $r(\lambda)$ defined above are dense in the space
of all continuous functions f on the extended spectrum of A, that is, the spectrum
compactified by the addition of the point $\infty$. That is, every such continuous function
can be approximated uniformly by a sequence of rational functions:

$$\lim r_k = f.$$

It follows that $\{r_k\}$ is a Cauchy sequence:

$$\max_{\lambda \in \sigma(A)} |r_k(\lambda) - r_l(\lambda)| \to 0.$$

We apply now (34) and conclude that $\|r_k(A) - r_l(A)\|$ tends to zero as k, l tend to
$\infty$. The norm limit of the sequence of operators $r_k(A)$ is defined as f(A).
It is easy to verify that the functional calculus we have just defined has all the
properties listed in theorem 5 of chapter 31. This functional calculus can be used,
just as it was done in chapter 31, section 31.3, to construct a spectral resolution of
the operator A. A spectral representation can be built along the lines of chapter 31,
section 31.5.


NOTES. The first unbounded operators for which a spectral theory has been de- 
veloped were differential operators, ordinary and partial. These will be discussed, 
lightly, in chapter 33. The first general spectral theory for unbounded integral opera- 
tors was developed by Carleman, in the context of singular integral operators. 

In the bibliography we list the early contributions to the spectral theory of self- 
adjoint operators. 


BIBLIOGRAPHY 


Carleman, T. Sur les équations intégrales singulières à noyau réel et symétrique. Almqvist and Wiksells,
Uppsala, 1923.


Doob, J. L. and Koopman, B. O. On analytic functions with positive imaginary parts. Bull. AMS, 40
(1934): 601-606.


Hellinger, E. and Toeplitz, O. Grundlagen einer Theorie der Unendlichen Matricen. Math. Ann., 69 (1910):
289-330.


Lengyel, B. A. and Stone, M. H. Elementary proof of the spectral theorem. Ann. Math., 37 (1936): 853-864.


Lorch, E. R. Functions of self-adjoint transformations in Hilbert space. Acta Sci. Math. Szeged, 7 (1934):
136-146.


Nevanlinna, R. Asymptotische Entwickelungen beschränkter Funktionen und das Stieltjessche Momentproblem.
Ann. Acad. Sci. Fennicae, A 18 (1922).


Riesz, F. Über die linearen Transformationen des komplexen Hilbertschen Raumes. Acta Sci. Math.
Szeged, 5 (1930): 23-54.


Riesz, F. and Lorch, E. R. The integral representation of unbounded self-adjoint transformations in Hilbert
space. Trans. AMS, 39 (1936): 331-340.


Stone, M. H. Linear transformations in Hilbert space, and their application to analysis. AMS Coll. Publ.,
15. American Mathematical Society, New York, 1932.


Sz.-Nagy, B. Spektraldarstellung linearer Transformationen des Hilbertschen Raumes. Ergebnisse der
Math., 5. Springer, Berlin, 1942.


von Neumann, J. Allgemeine Eigenwerttheorie Hermitescher Funktionaloperatoren. Math. Ann., 102
(1929): 49-131.


EXAMPLES OF SELF-ADJOINT OPERATORS


The definition of self-adjointness demands that the domain of such an operator be 
specified with the greatest precision. This is possible in some cases, but for most par- 
tial differential operators with variable coefficients defined on domains and subject 
to various boundary conditions, it is not possible—and not useful—to give an ex- 
act description of their domain. Instead, such operators are defined by some suitable 
process of extension. In the first part of this chapter we describe such processes. 


33.1 THE EXTENSION OF UNBOUNDED SYMMETRIC OPERATORS 


Definition. An operator C is called an extension of operator B if the domain of C 
contains the domain of B and Cu = Bu on their common domain. 


We are given a linear operator B mapping a dense subspace D(B) of a Hilbert 
space H into H that is symmetric: 


(Bu, v) = (u, Bv) (1) 
for all u and v in D(B). We pose the following questions: 


(i) Is it possible to extend B to a self-adjoint operator? 
(ii) In how many ways? 
(iii) By what process? 


We recall the notion of a closed operator as one whose graph in H x H is closed. 


We spell it out:


Definition. An operator $C$ mapping a dense subspace $D(C)$ of $H$ into $H$ is closed
if for every sequence $\{u_n\}$ in $D(C)$ that converges to a limit $u$ in $H$ and for which
$\{Cu_n\}$ converges to a limit $w$ in $H$, $u$ belongs to the domain of $C$ and $Cu = w$.




We describe now how to extend minimally any densely defined symmetric operator
$B$ to a closed symmetric operator. This minimal extension is called the closure of
$B$ and is denoted as $\bar{B}$. Take any sequence $\{u_n\}$ of vectors in $D(B)$ that converges to
a limit $u$, and for which $\{Bu_n\}$ converges to a limit $w$. Set $u = u_n$ in (1):


$$(Bu_n, v) = (u_n, Bv),$$

and let $n \to \infty$. We obtain that

$$(w, v) = (u, Bv)$$ (1')


for all $v$ in $D(B)$. Since $D(B)$ is dense in $H$, it follows from (1') that the vector $w$ is
uniquely determined by the vector $u$.


Definition. The closure $\bar{B}$ of $B$ is defined by setting $\bar{B}u = w$ for all pairs $u$, $w$
obtained as limits in the manner just described; every such pair satisfies (1') for all $v$ in $D(B)$.


Exercise 1. Show that if a closed operator $C$ is an extension of a densely defined
symmetric operator $B$, then $C$ is an extension of $\bar{B}$ as well.


Theorem 1. Let $B$ be a densely defined symmetric operator, $\bar{B}$ its closure.


(i) $\bar{B}$ is closed.
(ii) $\bar{B}$ is symmetric.
(iii) For any nonreal complex number $z$, $\bar{B} - z$ maps $D(\bar{B})$ one-to-one onto a
closed subspace of $H$.


Proof. (i) It follows by an easy, tedious argument from the construction of $\bar{B}$ that
$\bar{B}$ is a closed operator.

(ii) To show that $\bar{B}$ is symmetric, take any $v$ in $D(\bar{B})$ and any sequence $\{v_n\}$ in
$D(B)$ that converges to $v$, and for which $Bv_n$ converges to $\bar{B}v$. Set $v = v_n$ in (1'),
with $w = \bar{B}u$, and let $n \to \infty$. We obtain


$$(\bar{B}u, v) = (u, \bar{B}v),$$


the symmetry of $\bar{B}$.


(iii) Let $z$ be any complex number, $u$ any vector in the domain of $\bar{B}$, and denote by
$f$ the vector


$$f = \bar{B}u - zu.$$


Take the scalar product with $u$:


$$(\bar{B}u, u) - z(u, u) = (f, u).$$


Since $\bar{B}$ is symmetric, the first term on the left is real; since the imaginary parts of
the two sides are equal,


$$|\operatorname{Im} z|\,\|u\|^2 = |\operatorname{Im}(f, u)| \le \|f\|\,\|u\|.$$


It follows from this inequality that


$$\|u\| \le \frac{\|f\|}{|\operatorname{Im} z|};$$ (2)


for nonreal $z$ this implies that the operator $\bar{B} - z$ is one-to-one.


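Let $\{f_n\}$ be a sequence of vectors in the range of $\bar{B} - z$ that converges to $f$ in $H$:


$$(\bar{B} - z)u_n = f_n.$$ (2')


Applying (2) to the differences $u_n - u_m$ shows that $\{u_n\}$ is a Cauchy sequence, so it
converges to some limit $u$ in $H$. But then, by (2'), $\bar{B}u_n$ converges as well, to $f + zu$.
Since $\bar{B}$ is a closed operator, $u$ belongs to the domain of $\bar{B}$ and $\bar{B}u = f + zu$. □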
Here are a few useful corollaries:

Corollary 1. A self-adjoint operator $A$ is closed.


Proof. Since by part (ii) of theorem 1, $\bar{A}$ is symmetric, the domain of $\bar{A}$ is
contained in the domain of $A^*$. Since $A^*$ has the same domain as $A$, the corollary
follows. □


Combining corollary 1 with exercise 1, we deduce


Corollary 1'. When a self-adjoint operator $A$ is an extension of a densely defined
symmetric operator $B$, then $A$ is an extension of $\bar{B}$ as well.


The following result is basic for identifying self-adjoint operators. 


Theorem 2. A symmetric operator A is self-adjoint iff all nonreal complex numbers 
z belong to its resolvent set. 


Proof. That every nonreal complex $z$ belongs to the resolvent set of a self-adjoint
operator has been demonstrated in theorem 5 of chapter 32. To show the converse,
we take the case that for some $z$ nonreal, both $z$ and $\bar{z}$ belong to the resolvent set
of $A$. We show first that $(A - \bar{z})^{-1}$ is the adjoint of $(A - z)^{-1}$. The meaning of this
is that for any pair of vectors $f$ and $g$ in $H$,


$$\bigl((A - z)^{-1}f, g\bigr) = \bigl(f, (A - \bar{z})^{-1}g\bigr).$$ (3)


To see why this is so, we abbreviate

$$(A - z)^{-1}f = x, \qquad (A - \bar{z})^{-1}g = y;$$ (4)

then we can rewrite (3) as

$$\bigl(x, (A - \bar{z})y\bigr) = \bigl((A - z)x, y\bigr).$$ (4')


Since A is symmetric, this is valid for all x and y in the domain of A. Since A — z 
and A — Z map D(A) onto H, it follows that (3) holds for all f and g in H. 

We are now ready to prove that $A$ is self-adjoint. What we have to show is that if
$v$ belongs to the domain of $A^*$, then $v$ belongs to the domain of $A$ and $A^*v = Av$.
Now $v$ belongs to the domain of $A^*$, with $A^*v = w$, if for all $x$ in $D(A)$,


$$(Ax, v) = (x, w).$$ (5)

Subtracting $z(x, v)$ from both sides yields

$$\bigl((A - z)x, v\bigr) = (x, w - \bar{z}v).$$

Using the abbreviation (4) and relation (3), with $g = w - \bar{z}v$, we can rewrite this as

$$(f, v) = \bigl((A - z)^{-1}f, w - \bar{z}v\bigr) = \bigl(f, (A - \bar{z})^{-1}(w - \bar{z}v)\bigr).$$ (5')

Since (5) holds for all $x$ in $D(A)$, (5') holds for all $f$ in $H$; it follows therefore that

$$v = (A - \bar{z})^{-1}(w - \bar{z}v).$$

Since the range of $(A - \bar{z})^{-1}$ is $D(A)$, it follows that $v$ belongs to $D(A)$; acting
on both sides by $A - \bar{z}$ shows that $Av = w$. Since $w$ was defined above as $A^*v$,
$A^*v = Av$. □


NOTE. In proving the converse, we used only the assumption that some nonreal $z$
and $\bar{z}$ belong to the resolvent set of $A$.


33.2 EXAMPLES OF THE EXTENSION OF SYMMETRIC OPERATORS; 
DEFICIENCY INDICES 


We turn now to some examples that illustrate the notion of closure of a symmetric
operator, and the possibilities of self-adjoint extensions.


Example 1.


Definition. Denote by $H$ the Hilbert space $L^2(\mathbb{R})$, and define the operator $B$ as
$i(d/dx)$ acting on the domain $D(B) = C_0^1$ consisting of all once differentiable
functions on $\mathbb{R}$ with compact support.


Proposition. $B$ is symmetric, and its closure $\bar{B}$ is self-adjoint.


Proof. Integration by parts shows that $B$ is symmetric. Let $z$ be any complex number.
The range of $B - z$ consists of all functions $f$ of the form


$$iu_x - zu = f, \qquad u \in C_0^1.$$ (6)


Multiply by $-ie^{izx}$:


$$\frac{d}{dx}\bigl(e^{izx}u\bigr) = -ie^{izx}f.$$ (6')


Integrate over $\mathbb{R}$; since $u$ has compact support, we get


$$0 = \int_{-\infty}^{\infty} e^{izx}f\,dx,$$ (7)


a condition satisfied by every function $f$ in the range of $B - z$. Conversely, every $C_0$
function $f$ on $\mathbb{R}$ that satisfies (7) belongs to the range of $B - z$. To see this, define $u$
by


$$u(x) = -i\int_{-\infty}^{x} e^{iz(y - x)}f(y)\,dy.$$ (8)


Clearly, $u$ has a continuous derivative, and if $f$ is zero outside a compact interval $S$, it
follows from (7) and (8) that $u$ too is zero outside $S$.

The function $e^{izx}$ is not square integrable on $\mathbb{R}$; therefore the set of continuous
functions $f$ of compact support that satisfy condition (7) is a dense subset of $L^2(\mathbb{R})$.
According to part (iii) of theorem 1, for $z$ nonreal the range of $\bar{B} - z$ is closed; so
the range is all of $H$. Since $\bar{B} - z$ is one-to-one, $z$ belongs to the resolvent set of $\bar{B}$.
According to theorem 2, it follows that $\bar{B}$ is self-adjoint. □

A symmetric operator whose closure is self-adjoint is called essentially
self-adjoint.
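
To make the converse step in example 1 concrete, the following sketch checks that the function u defined by formula (8) does solve equation (6). It assumes, as in the text, that f is continuous with compact support and satisfies (7).

```latex
% Assumption: f is continuous, compactly supported, and satisfies (7).
% Multiply (8) by e^{izx} and differentiate with respect to x:
\[
  e^{izx}u(x) = -i\int_{-\infty}^{x} e^{izy} f(y)\,dy
  \quad\Longrightarrow\quad
  \frac{d}{dx}\bigl(e^{izx}u\bigr) = -i\,e^{izx} f(x).
\]
% Expand the derivative on the left, then multiply by i e^{-izx}:
\[
  e^{izx}\bigl(u_x + izu\bigr) = -i\,e^{izx} f
  \quad\Longrightarrow\quad
  i\,u_x - z\,u = f .
\]
% To the right of the support of f, condition (7) makes the integral vanish:
\[
  e^{izx}u(x) = -i\int_{-\infty}^{\infty} e^{izy} f(y)\,dy = 0 .
\]
```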


Example 2. 


Definition. Denote by $H$ the Hilbert space $L^2(\mathbb{R}_+)$, and define the operator $B$ as
$i(d/dx)$ acting on the domain $D(B) = C_0^1$ consisting of all once differentiable
functions whose support is a compact subset of $(0, \infty)$.


Proposition. $B$ is symmetric, but its closure $\bar{B}$ is not self-adjoint. Furthermore $B$
has no self-adjoint extension whatsoever.


Proof. Symmetry of $B$ follows by integration by parts, since all functions in the
domain of $B$ are zero near 0 and $\infty$. Arguing as in example 1, we conclude that a
continuous function of compact support on $\mathbb{R}_+$ belongs to the range of $B - z$ iff

$$0 = \int_0^{\infty} e^{izx}f\,dx.$$ (9)


For $\operatorname{Im} z < 0$, the function $e^{izx}$ is not square integrable on $\mathbb{R}_+$; therefore the set of
$C_0$ functions $f$ satisfying (9) is dense in $L^2(\mathbb{R}_+) = H$. It follows from part (iii) of theorem 1
that the range of $\bar{B} - z$ is all of $H$.

It is otherwise when $\operatorname{Im} z > 0$, for then the function $e^{izx}$ is square integrable
on $\mathbb{R}_+$, and therefore the range of $\bar{B} - z$ consists of all $f$ in $H$ that satisfy the
orthogonality condition (9). It follows from theorem 2 that $\bar{B}$ is not self-adjoint.

To see that $B$ has no self-adjoint extension, we note that according to corollary 1',
such an extension $A$ would have to be an extension of $\bar{B}$ as well. Let $v$ be a function
in the domain of $A$ that does not belong to the domain of $\bar{B}$. Choose any $z$ with
$\operatorname{Im} z < 0$; since $\bar{B} - z$ maps $D(\bar{B})$ onto $H$, there is a function $u$ in $D(\bar{B})$ such that


$$(\bar{B} - z)u = (A - z)v.$$

Since $A$ is an extension of $\bar{B}$,

$$(A - z)(v - u) = 0.$$


This is impossible unless $v - u = 0$, for $A$ is symmetric, and so, according to theorem
1, $A - z$ has no nonzero nullvector for nonreal $z$. □
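
The dichotomy between the two half-planes in example 2 rests on an elementary integral. The following worked computation spells it out; the letters a and b for the real and imaginary parts of z are notation introduced here only for this sketch.

```latex
% Write z = a + ib with a, b real (notation for this sketch only),
% so that |e^{izx}| = e^{-bx} for real x.
\[
  \int_{0}^{\infty} \bigl|e^{izx}\bigr|^{2}\,dx
  = \int_{0}^{\infty} e^{-2bx}\,dx
  = \begin{cases}
      \dfrac{1}{2b}, & b = \operatorname{Im} z > 0,\\[1ex]
      \infty, & b = \operatorname{Im} z \le 0 .
    \end{cases}
\]
% On the whole line the integral of e^{-2bx} diverges for every b,
% which is the case used in example 1.
```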


Example 3. 


Definition. Take $H = L^2(0, 1)$, $B = i(d/dx)$ acting on functions $u$ in $D(B) = C_0^1$
consisting of continuously differentiable functions on $[0, 1]$ that vanish at $x = 0$
and 1.


Proposition. $B$ is symmetric, but its closure $\bar{B}$ is not self-adjoint. However, $B$ has
self-adjoint extensions.


Proof. Symmetry of $B$ follows by integration by parts. Arguing as before, we see
that the range of $\bar{B} - z$, $\operatorname{Im} z \neq 0$, consists of all $L^2$ functions $f$ that satisfy the
orthogonality condition


$$\int_0^1 e^{izx}f\,dx = 0.$$


According to theorem 2, $\bar{B}$ is not self-adjoint.

We construct now some self-adjoint extensions of $B$.

Definition. Let $\alpha$ be any complex number $\neq 1$ but of absolute value 1: $|\alpha| = 1$.
Define $A_\alpha$ to be the operator $i(d/dx)$ acting on all $C^1$ functions that satisfy the
boundary condition


$$u(1) = \alpha u(0).$$


Clearly, $A_\alpha$ is an extension of $B$.


Exercise 2. Show that $A_\alpha$ is symmetric.


We show now that the closure of $A_\alpha$ is self-adjoint. Take any complex $z$, $\operatorname{Im} z \neq 0$;
we claim that every continuous function $f$ belongs to the range of $A_\alpha - z$. To show
this, we integrate (6') from 0 to $x$. Denoting by $c$ the value of $u$ at zero, we get


$$u(x) = e^{-izx}c - i\int_0^x e^{iz(y - x)}f(y)\,dy.$$


In particular,


$$u(1) = e^{-iz}c - i\int_0^1 e^{iz(y - 1)}f(y)\,dy.$$


Setting $u(1) = \alpha c$ gives an equation for $c$ that, for $\alpha \neq 1$, has a unique solution.
Since the continuous functions are dense in $L^2$, it follows that the range of $A_\alpha - z$
is dense. Therefore the range of $\bar{A}_\alpha - z$ is all of $H$. By theorem 2, $\bar{A}_\alpha$ is self-adjoint.
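
The equation for c mentioned above can be written out explicitly. The following is a sketch under the assumption that u(1) is given by the formula displayed just before, with the boundary parameter of absolute value 1 written as alpha; the abbreviation J is introduced here only for convenience.

```latex
% J is a shorthand introduced for this sketch; it is not notation from the text.
\[
  J = -i\int_{0}^{1} e^{iz(y-1)} f(y)\,dy, \qquad u(1) = e^{-iz}c + J .
\]
% Impose the boundary condition u(1) = \alpha u(0) = \alpha c:
\[
  \alpha c = e^{-iz}c + J
  \quad\Longrightarrow\quad
  c = \frac{J}{\alpha - e^{-iz}} .
\]
% The denominator cannot vanish for nonreal z, since the moduli differ:
\[
  |\alpha| = 1, \qquad \bigl|e^{-iz}\bigr| = e^{\operatorname{Im} z} \neq 1
  \quad (\operatorname{Im} z \neq 0).
\]
```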


These examples illustrate well the extension problem for symmetric operators. 
The general result is due to von Neumann. 


Theorem 3. Let $C$ be a densely defined, closed symmetric operator in a Hilbert
space $H$. According to theorem 1, for $\operatorname{Im} z \neq 0$ the range of $C - z$ is a closed
subspace of $H$.


(i) The codimension of the range of $C - z$ is the same for all $z$ with $\operatorname{Im} z > 0$.
Similarly the codimensions are the same for all $z$ with $\operatorname{Im} z < 0$. These
codimensions, denoted as $n_+$ and $n_-$, are called the deficiency indices of the
operator $C$.

(ii) $C$ has a self-adjoint extension iff $n_+ = n_-$.


Proof. We sketch the proof of (ii). Form the Cayley transform of $C$:

$$V = (C - i)(C + i)^{-1}.$$ (10)


$V$ maps the range of $C + i$ onto the range of $C - i$. As shown in the proof of theorem 8
in chapter 32, $V$ is an isometry. Clearly, an isometry can be extended to a unitary
operator $U$ iff the codimensions of its domain and range are equal. Suppose that
$n_+ = n_-$; then $V$ can be so extended. Call this extension $U$. We claim that the inverse
Cayley transform of $U$,

$$A = i(I + U)(I - U)^{-1},$$ (10')


defined on the range of $I - U$, is a self-adjoint extension of $C$. First we have to show
that $U$ has a Cayley inverse, namely that $I - U$ has no nullvector. To see that $I - U$
annihilates only the zero vector, suppose that $(I - U)n = 0$. By adjointness, for any
vector $y$,


$$0 = \bigl((I - U)n, y\bigr) = \bigl(n, (I - U^*)y\bigr);$$


it follows that $n$ is orthogonal to the range of $I - U^*$. Since $U$ is unitary, this equals
the range of $(I - U^*)U = U - U^*U = U - I$.

From formula (10) we get that $V - I = -2i(C + i)^{-1}$. According to theorem
1, the range of $(C + i)^{-1}$ is the domain of $C$, a dense subspace of $H$. Since $U$ is
an extension of $V$, the range of $I - U$ contains the range of $I - V$, and so it too is
dense in $H$. This proves $n = 0$, and so $I - U$ is invertible. Since according to (10')
the domain of $A$ is the range of $I - U$, it follows that $A$ is densely defined.

Next we show that $A$ is symmetric. Let $u$ and $v$ be a pair of vectors in the domain
of $A$. By definition (10') of $A$,


$$(Au, v) = i\bigl((I + U)(I - U)^{-1}u, v\bigr).$$

By adjointness, this equals


$$i\bigl(u, (I - U^*)^{-1}(I + U^*)v\bigr).$$


Since $U$ is unitary, $U^* = U^{-1}$; so the above can be rewritten as

$$i\bigl(u, (I - U^{-1})^{-1}(I + U^{-1})v\bigr) = i\bigl(u, (U - I)^{-1}U(I + U^{-1})v\bigr)$$
$$= i\bigl(u, (U - I)^{-1}(U + I)v\bigr) = (u, Av).$$

In the last step we used the fact that in formula (10') defining $A$, the two factors
commute.


To show that $A$ is not only symmetric but self-adjoint we use theorem 2 and verify
that every nonreal complex number $z$ belongs to the resolvent set of $A$. Using
definition (10') of $A$, we write


$$A - zI = i\bigl(I + U + iz(I - U)\bigr)(I - U)^{-1} = i\bigl((1 + iz)I + (1 - iz)U\bigr)(I - U)^{-1}.$$


We saw in chapter 31, section 31.7, that the spectrum of a unitary operator lies on the
unit circle. Since for nonreal $z$, $(1 + iz)/(1 - iz)$ does not lie on the unit circle, the
first factor on the right above maps $H$ onto $H$; the second factor maps the domain of
$A$ onto $H$. This shows that $A - zI$ maps the domain of $A$ one-to-one onto $H$. □
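
The claim that (1 + iz)/(1 - iz) lies off the unit circle for nonreal z can be checked by a short computation. In the sketch below, x and y denote the real and imaginary parts of z; this notation is introduced here only for the check.

```latex
% Write z = x + iy with x, y real (notation for this sketch only).
\[
  1 + iz = (1 - y) + ix, \qquad 1 - iz = (1 + y) - ix .
\]
% Compare the squared moduli; the x^2 terms cancel.
\[
  |1 + iz|^{2} - |1 - iz|^{2} = (1 - y)^{2} - (1 + y)^{2} = -4y .
\]
% Hence the two moduli agree only when y = Im z = 0.
\[
  \left|\frac{1 + iz}{1 - iz}\right| = 1 \iff \operatorname{Im} z = 0
  \qquad (z \neq -i).
\]
```

For z = -i the factor (1 + iz)I + (1 - iz)U reduces to 2I, which is trivially invertible.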


Exercise 3. Prove part (i) of theorem 3. 


REMARK. The deficiency indices are nonnegative integers or $\infty$. We assume that $H$
is separable, so there is only one kind of $\infty$.


Exercise 4. What are the deficiency indices of Examples 1, 2, and 3?


Theorem 3 has this important


Corollary. Let $K$ be a Hilbert space over the reals, and $B$ a densely defined
symmetric operator on $K$. Such a $B$ has a self-adjoint extension to the complexification
of $K$.


Proof. The complexification of $K$ is $H = K + iK$; there is a natural extension
of $B$ to $H$. There is a natural complex conjugation in $H$: $\overline{u + iv} = u - iv$, $u$ and $v$
in $K$. Conjugation commutes with the action of $B$.

Denote by $C$ the closure of $B$; $C$ too commutes with complex conjugation. It
follows that the range of $C - zI$ is the complex conjugate of the range of $C - \bar{z}I$. It
follows that the codimension of the range of $C - zI$ equals the codimension of the
range of $C - \bar{z}I$. This proves that the deficiency indices of $C$ are equal; so according
to theorem 3, $C$ has a self-adjoint extension. □


33.3 THE FRIEDRICHS EXTENSION 


In this section we describe an enormously useful method, due to Friedrichs, of
constructing a self-adjoint extension of a large class of symmetric operators, such as
Schroedinger operators. The way this extension is carried out imposes certain
boundary conditions.


Definition. A symmetric operator $L$ defined on a dense subspace $D$ of a Hilbert
space $H$ is semibounded (from below) if


$$c\|u\|^2 \le (u, Lu)$$ (11)

for some constant $c$ and all $u$ in $D$.


In what follows we will take the constant $c$ to be 1; this can be accomplished by
augmenting $L$ by a sufficiently large multiple of the identity. We define on $D$ a new
scalar product, denoted by $(v, u)_L$, as follows:


$$(v, u)_L = (v, Lu).$$ (12)


The symmetry of the operator $L$ guarantees that $(v, u)_L$ is skew symmetric, and
semiboundedness, with $c = 1$, shows that $(u, u)_L$ is positive. We define the $L$-norm
as


$$\|u\|_L = (u, u)_L^{1/2}.$$


It follows from (11) with $c = 1$ that the $L$-norm of every $u$ in $D$ is at least as big as its
original norm:


$$\|u\| \le \|u\|_L.$$ (11')


The subspace $D$ is, in general, not complete in the $L$-norm; we can complete it.
Denote its completion by $H_L$; it consists of equivalence classes of Cauchy sequences
in the $L$-norm. By (11'), a Cauchy sequence in the $L$-norm is also a Cauchy sequence
in the norm of $H$. Since $H$ is complete, such a Cauchy sequence has a limit in $H$;
this defines a natural mapping of $H_L$ into $H$.


Lemma. The natural mapping of $H_L$ into $H$ is one-to-one.


Proof. Let $\{u_n\}$ be a Cauchy sequence in the $L$-norm of vectors in $D$ that tends
to $u^L$ in $H_L$. As noted above, $\{u_n\}$ is a Cauchy sequence also in the original norm;
denote its limit in $H$ by $u$. By definition of the $L$-scalar product, for every $v$ in $D$,


$$(u_n, v)_L = (u_n, Lv).$$


The limit of this relation as $n$ tends to $\infty$ is


$$(u^L, v)_L = (u, Lv).$$


This shows that the $L$-scalar product of $u^L$ with any $v$ in $D$ is uniquely determined
by $u$. Since $D$ is a dense subspace of $H_L$, $u^L$ is completely determined by $u$. □


In view of the lemma, the natural map $H_L \to H$ is an embedding of $H_L$ into $H$.
We will regard $H_L$ as a subspace of $H$; note that $D$ is contained in $H_L$. We will
define now the Friedrichs extension of $L$, to be denoted as $L^F$, as follows: Take any
vector $g$ in $H$, and for every $v$ in $H$ define


$$\ell(v) = (v, g).$$ (13)


$\ell$ is a bounded linear functional of $v$:

$$|\ell(v)| \le \|v\|\,\|g\|.$$ (14)


It follows from (11') that for all $v$ in $H_L$,


$$|\ell(v)| \le \|v\|_L\,\|g\|;$$ (14')


this shows that $\ell(v)$ is a bounded linear functional on $H_L$. According to the
Riesz-Frechet representation theorem, we can write for all $v$ in $H_L$,


$$\ell(v) = (v, w)_L,$$ (13')


$w$ some vector in $H_L$. We denote the set of all such vectors $w$ as $D^F$.
It follows from (13') that for all $v$ in $H_L$, the value of $\ell(v)$ is determined by $w$.
Comparing this with (13), we conclude that $g$ is determined by $w$, meaning that $g$ is
a function of $w$. Since $\ell$ depends linearly on $g$, it follows that $g$ is a linear function
of $w$. We denote this function as $L^F$:


$$L^F w = g, \qquad w \text{ in } D^F.$$ (15)


Combining the two representations (13) and (13') of $\ell(v)$, and using the definition
(15), we get


$$(v, w)_L = (v, L^F w)$$ (16)


for all $w$ in $D^F$ and all $v$ in $H_L$.
Take $g$ in (13) to be $g = Lu$, $u$ some vector in $D$. Then for all $v$ in $D$,


$$\ell(v) = (v, g) = (v, Lu) = (v, u)_L.$$


Comparing this with (13'), we conclude that $u = w$ and $L^F w = Lu$. In words, $D$ is
a subspace of $D^F$, and $L^F$ is an extension of $L$.


Theorem 4. $L^F$ is a self-adjoint extension of $L$.


Proof. We show first that $L^F$ is symmetric. Restrict in (16) the vector $v$ to $D^F$;
interchanging $w$ and $v$ in (16) gives


$$(w, v)_L = (w, L^F v).$$

Since both scalar products are skew-symmetric, we deduce from this that

$$(v, w)_L = (L^F v, w).$$


Comparing this with (16), we conclude that $L^F$ is symmetric.

The vector $g$ in the definition (13) of the functional $\ell$ is an arbitrary vector in $H$;
the vector $w$ in (15) is uniquely determined by $g$. This shows that $L^F$ is an invertible
operator, namely that it maps its domain $D^F$ one-to-one onto $H$.

Denote the inverse of $L^F$ by $M$. Since $L^F$ is symmetric, so is $M$. According to
the Hellinger-Toeplitz result described at the beginning of chapter 32, the symmetric
operator $M$ is bounded. According to theorem 2 of chapter 31, every nonreal complex
number $z$ belongs to the resolvent set of such an operator $M$. The formula


$$z^{-1}I - M^{-1} = M^{-1}(M - zI)z^{-1}$$ (17)


shows that then $z^{-1}$ belongs to the resolvent set of $M^{-1}$. According to theorem 2,
this implies that $M^{-1} = L^F$ is self-adjoint. □
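
Formula (17) and its consequence can be checked in two lines. The sketch below assumes only what the proof states: M is bounded with inverse equal to the Friedrichs extension, and z is nonreal, so that M - zI has a bounded inverse.

```latex
% Assumptions, as in the proof: M bounded, M^{-1} = L^F, z nonreal.
% Expand the right side of (17):
\[
  M^{-1}(M - zI)z^{-1} = (I - zM^{-1})z^{-1} = z^{-1}I - M^{-1}.
\]
% Inverting the product gives the resolvent of M^{-1} at the point z^{-1}:
\[
  \bigl(z^{-1}I - M^{-1}\bigr)^{-1} = z\,(M - zI)^{-1}M ,
\]
% a bounded operator, being a product of bounded operators.
```

Since every nonreal number is of the form 1/z with z nonreal, this covers all nonreal points, as theorem 2 requires.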


Exercise 5. Show that the inverse of a symmetric operator is symmetric. 


We give now some examples of semibounded operators and their Friedrichs
extensions.


Example 4. $H = L^2(0, 1)$, $L = -(d^2/dx^2) + q$, $q$ some continuous function on
$[0, 1]$. The domain of $L$ is $C_0^2(0, 1)$. Since every $u$ in $D(L)$ is zero at the endpoints,
integration by parts gives


$$\|u\|_L^2 = (u, Lu) = \int (u_x^2 + qu^2)\,dx.$$


Clearly, inequality (11) is satisfied, with $c = \min q$. Let us assume that
$c = \min q = 1$.


Proposition. Every function in $H_L$ is continuous on the closed interval $[0, 1]$ and
vanishes at the endpoints.


Proof. For every $u$ in $C_0^2$, we deduce using the Schwarz inequality that for every
$a$, $b$ in $[0, 1]$,


$$|u(b) - u(a)| = \left|\int_a^b u_x\,dx\right| \le \sqrt{|b - a|}\left(\int_a^b u_x^2\,dx\right)^{1/2} \le \sqrt{|b - a|}\,\|u\|_L.$$ (18)


It follows that a Cauchy sequence in the $L$-norm converges uniformly, and that the
limit $u$ in $H_L$ satisfies (18) and is zero at the endpoints. □


Since $D(L^F)$ is contained in $H_L$, this shows that the extension process imposes
zero Dirichlet boundary conditions on functions in the domain of $L^F$.


Exercise 6. Show that the closure of the operator $L$ in example 4 is not self-adjoint.

Exercise 7. $H = L^2(0, 1)$, $L = -(d/dx)p(d/dx) + q$, where $p$ is a positive function
in $C^1$, $q$ in $C$; the domain of $L$ is $C_0^2(0, 1)$. Show that every $u$ in the domain of
$L^F$ is continuous in $[0, 1]$ and is zero at the endpoints.


Example 5. $G$ a bounded domain in the $x$, $y$ plane, $H = L^2(G)$, the space of square
integrable functions in $G$, $L = -\Delta = -(\partial_x^2 + \partial_y^2)$; the domain of $L$ is the space
$C_0^2(G)$.


Proposition. When $G$ has a smooth boundary, every function $u$ in the domain of $L^F$
vanishes on the boundary of $G$ in the following sense:


$$\int_{C_n} u^2\,ds$$


tends to zero as the curves $C_n$ approach the boundary curve $C$ of $G$, and the tangents of
$C_n$ approach the tangent of $C$.


Exercise 8. Prove this proposition. (Hint: Show that $\|u\|_L^2 = \int (u_x^2 + u_y^2)\,dx\,dy$.)


Exercise 9. Show that the operator $L$ in example 5 has deficiency indices $\infty$, $\infty$.


33.4 THE RELLICH PERTURBATION THEOREM


In this section we present a result of Rellich that says, roughly speaking, that if we
add to a self-adjoint operator $A$ a symmetric operator $T$ that is not too large compared
to $A$ then the sum $A + T$ is self-adjoint. Here is the precise result:


Theorem 5. Let $A$ denote a self-adjoint operator acting in a Hilbert space $H$, with
domain $D(A)$. Let $T$ be a symmetric operator in $H$ whose domain includes the
domain of $A$ and which is smaller than $A$ in this sense: there exist numbers $a$ and $b$,
$b < 1$, such that for all $u$ in $D(A)$,

$$\|Tu\|^2 \le a^2\|u\|^2 + b^2\|Au\|^2.$$ (19)

Then $A + T$ defined on $D(A)$ is self-adjoint.


Proof. We show first that the operator $A + T$ is closed. As a first step we take the
square root of inequality (19) and deduce


$$\|Tu\| \le a\|u\| + b\|Au\|.$$ (19')

For any $u$ in $D(A)$,

$$Au = (A + T)u - Tu.$$

So, by the triangle inequality combined with (19'), we get

$$\|Au\| \le \|(A + T)u\| + \|Tu\| \le \|(A + T)u\| + a\|u\| + b\|Au\|.$$

Since $b < 1$, it follows that

$$\|Au\| \le (1 - b)^{-1}\|(A + T)u\| + (1 - b)^{-1}a\|u\|.$$

It follows from this inequality that any vector in the domain of the closure of $A + T$
belongs to the domain of the closure of $A$. Since $A$ is self-adjoint, it is closed; it
follows that so is $A + T$.


We will show now that $ic$ and $-ic$ belong to the resolvent set of $A + T$ when
$c > a$, where $a$ is the constant appearing in inequality (19). Since $A + T$ is a closed
operator, we can appeal to theorem 1 and conclude that $A + T + ic$ maps $D(A)$
one-to-one onto a closed subspace of $H$. We claim that this subspace is all of $H$; for if
not, there would be a nonzero vector $v$ in $H$ perpendicular to the range of $A + T + ic$:


$$\bigl((A + T + ic)u, v\bigr) = 0$$ (20)


for all $u$ in $D(A)$. Since $A$ is self-adjoint, the range of $A + ic$ is all of $H$ so that there
is $w$ in $D(A)$ that is mapped into $v$:


$$(A + ic)w = v.$$

Setting this into (20) and choosing $u = w$ gives

$$(Aw + icw, Aw + icw) + (Tw, Aw + icw) = 0.$$


Estimating the second term by the Schwarz inequality yields after an algebraic
manipulation


$$\|Aw + icw\|^2 \le \|Tw\|^2.$$


Since $A$ is symmetric, the left side equals $\|Aw\|^2 + c^2\|w\|^2$; the right side is
bounded by (19):


$$\|Aw\|^2 + c^2\|w\|^2 \le a^2\|w\|^2 + b^2\|Aw\|^2.$$


Since $b < 1$ and $a < c$, we conclude that $w = 0$; this makes $v = 0$, and shows that
the range of $A + T + ic$ is all of $H$. Ditto for the range of $A + T - ic$. According to
theorem 2, this implies that $A + T$ is self-adjoint. □
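
The two algebraic steps at the end of the proof of theorem 5 can be written out as follows. This is a sketch under the conventions of the text, with the scalar product linear in its first argument and conjugate linear in its second, and with c real.

```latex
% Convention assumed: (u, v) is linear in u and conjugate linear in v; c is real.
% Step 1: the cross terms cancel because (Aw, w) = (w, Aw) for symmetric A.
\[
  \|Aw + icw\|^{2}
  = \|Aw\|^{2} - ic\,(Aw, w) + ic\,(w, Aw) + c^{2}\|w\|^{2}
  = \|Aw\|^{2} + c^{2}\|w\|^{2}.
\]
% Step 2: combine with (19) and collect terms on one side.
\[
  \|Aw\|^{2} + c^{2}\|w\|^{2} \le a^{2}\|w\|^{2} + b^{2}\|Aw\|^{2}
  \quad\Longrightarrow\quad
  (1 - b^{2})\|Aw\|^{2} + (c^{2} - a^{2})\|w\|^{2} \le 0 .
\]
% Both coefficients are positive when b < 1 and 0 <= a < c, which forces w = 0.
```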


Corollary 5. Let $B$ denote an essentially self-adjoint operator, that is, one whose
closure $A$ is self-adjoint. Let $T$ denote a symmetric operator whose domain includes
the domain of $B$ and that satisfies inequality (19), with $B$ in place of $A$, for all $u$ in $D(B)$.
Then the domain of $\bar{T}$ includes the domain of $\bar{B}$, and $\bar{B} + \bar{T}$ is self-adjoint on $D(\bar{B})$.


Exercise 10. Prove corollary 5.


Example 6. $H$ is $L^2(\mathbb{R})$, $B = -d^2/dx^2$, $D(B) = C_0^2(\mathbb{R})$, the space of twice
differentiable functions of compact support. $T$ is multiplication by some real-valued
function $q$ in class $L^2$; $D(T)$ consists of all $C_0(\mathbb{R})$ functions.


Proposition. 


(i) B is essentially self-adjoint. 
(ii) T is bounded by B in the sense of inequality (19). 




Proof. Part (i) is proved by showing that for nonreal $z$ the range of $B - zI$ is a
dense subset of $H$. This can be verified following the analysis given in example 1 in
section 33.2. To show part (ii), we start with inequality (18) derived in the discussion
of example 4:


$$|u(b) - u(a)| \le \sqrt{|b - a|}\,\|u_x\|.$$ (18)


Given any point $a$ in $\mathbb{R}$, there is a point $b$ in the interval $[a - \tfrac12, a + \tfrac12]$ such that


$$|u(b)| \le \left(\int_{a - 1/2}^{a + 1/2} |u|^2\,dx\right)^{1/2} \le \|u\|.$$


Then we deduce from (18) that for any $a$,


$$|u(a)| \le \|u\| + \frac{1}{\sqrt{2}}\|u_x\|.$$


Taking the sup over all $a$, we conclude that


$$\|u\|_\infty \le \|u\| + \frac{1}{\sqrt{2}}\|u_x\|.$$

Squaring this inequality gives

$$\|u\|_\infty^2 \le 2\|u\|^2 + \|u_x\|^2.$$ (21)
Integration by parts gives


$$(Bu, u) = -\int u_{xx}\bar{u}\,dx = \int |u_x|^2\,dx = \|u_x\|^2.$$


Applying the Schwarz inequality on the left followed by the arithmetic-geometric
mean inequality gives


$$\|u_x\|^2 \le \|u\|\,\|Bu\| \le \frac{1}{2\epsilon}\|u\|^2 + \frac{\epsilon}{2}\|Bu\|^2$$ (22)

for any positive $\epsilon$. Putting together the two inequalities (21) and (22) gives

$$\|u\|_\infty^2 \le \left(2 + \frac{1}{2\epsilon}\right)\|u\|^2 + \frac{\epsilon}{2}\|Bu\|^2.$$ (23)

We turn now to $T$:

$$\|Tu\|^2 = \int q^2|u|^2\,dx \le \int q^2\,dx\,\|u\|_\infty^2 = Q\|u\|_\infty^2,$$


where $Q$ abbreviates $\int q^2\,dx$. Using the estimate (23) on the right side, we get


$$\|Tu\|^2 \le Q\left(2 + \frac{1}{2\epsilon}\right)\|u\|^2 + Q\frac{\epsilon}{2}\|Bu\|^2.$$ (24)


Taking $\epsilon$ small enough, the coefficient of $\|Bu\|^2$ in (24) can be made $< 1$; this shows
that inequality (19) is satisfied for all $u$ in $D(B)$. We appeal now to corollary 5 to
conclude that $\bar{B} + \bar{T}$ is self-adjoint. □
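
To see concretely how the constants in the Rellich bound (19) come out of estimate (24), here is one admissible choice of the free parameter. The specific value chosen below is an illustration, not the text's choice, and it assumes Q > 0 (when Q = 0 the operator T vanishes and there is nothing to prove).

```latex
% Illustrative choice, assuming Q = \int q^2 dx > 0:
\[
  \epsilon = \frac{1}{Q}
  \quad\Longrightarrow\quad
  Q\,\frac{\epsilon}{2} = \frac12, \qquad
  Q\Bigl(2 + \frac{1}{2\epsilon}\Bigr) = 2Q + \frac{Q^{2}}{2}.
\]
% So (24) takes the form (19) with
\[
  \|Tu\|^{2} \le a^{2}\|u\|^{2} + b^{2}\|Bu\|^{2}, \qquad
  a^{2} = 2Q + \frac{Q^{2}}{2}, \quad b^{2} = \frac12 < 1 .
\]
```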


Exercise 11. Prove that B is essentially self-adjoint. 


Example 7. $H$ is $L^2(\mathbb{R})$, $B = -d^2/dx^2$, and $D(B)$ consists of twice differentiable
functions of compact support that are zero at the origin. $T$ is multiplication by a
real-valued function $q$ subject to these restrictions:


(i) $|q(x)| \le c/|x|^p$ for $|x| < 1$; $p < 1$.


(ii) $q$ is $L^2$ outside of the interval $[-1, 1]$.
Proposition. 


(i) B is essentially self-adjoint. 
(ti) T is bounded by B in the sense of inequality (19). 


Proof. For part (i), see the remark in the previous example. For part (ii), we need
another inequality in addition to (23). It too is based on inequality (18). We set $a = 0$
and use the fact that $u(0) = 0$. Squaring, we get


$$|u(b)|^2 \le |b|\,\|u_x\|^2.$$ (25)


Next we break up the integral for $\|Tu\|^2$ into two parts:


$$\|Tu\|^2 = \int_{\mathbb{R}} q^2u^2\,dx = \int_I + \int_{\mathbb{R} - I},$$ (26)


where $I$ is the interval $[-1, 1]$. On $I$ we use the bound (i) for $q$ and the estimate (25)
for $u$:


$$\int_I q^2u^2\,dx \le c^2\int_I \frac{|x|}{|x|^{2p}}\,dx\,\|u_x\|^2 = \frac{c^2}{1 - p}\|u_x\|^2.$$

We use (22) to estimate $\|u_x\|^2$ on the right side:


$$\int_I q^2u^2\,dx \le \frac{c^2}{1 - p}\left(\frac{1}{2\epsilon}\|u\|^2 + \frac{\epsilon}{2}\|Bu\|^2\right).$$ (27)


We estimate the right side of (26) using (27) for the first term, and (24) for the second
term, with $Q$ now denoting the integral of $q^2$ over $\mathbb{R} - I$:


$$\|Tu\|^2 \le 2Q\|u\|^2 + \left(\frac{c^2}{1 - p} + Q\right)\left(\frac{1}{2\epsilon}\|u\|^2 + \frac{\epsilon}{2}\|Bu\|^2\right).$$ (28)


Since $\epsilon$ is arbitrary, we can take it so small that the coefficient of $\|Bu\|^2$ in (28) is
less than 1. So an inequality of form (19) is satisfied for all $u$ in $D(B)$. By corollary
5, $\bar{B} + \bar{T}$ is self-adjoint. □


Exercise 12. Carry out the details of the proof that $B$ is essentially self-adjoint.


33.5 THE MOMENT PROBLEM 


In chapter 14, section 14.7, we formulated the


Hamburger Moment Problem. What sequences $a_0, a_1, \ldots$ of real numbers can
be represented as moments of a mass distribution on $\mathbb{R}$:

$$a_n = \int_{\mathbb{R}} t^n\,dm,$$ (29)


where $m$ is a nonnegative measure whose support is larger than a finite set of points.


As observed in chapter 14, every sequence $a_0, a_1, \ldots$ of form (29) is Hankel
positive, which means that the quadratic form


$$Q = \sum a_{n+k}\xi_n\xi_k$$ (30)

is positive for every nonzero choice of real numbers $\xi_0, \ldots, \xi_N$, $N$ arbitrary. This is
evident from the formula

$$\sum a_{n+k}\xi_n\xi_k = \int \sum \xi_n\xi_k t^{n+k}\,dm = \int \Bigl(\sum \xi_n t^n\Bigr)^2 dm.$$ (30')


Clearly, the right side of (30') is nonnegative. Furthermore, when the support of the
measure $m$ is not finite, it is not confined to the zeros of the polynomial $\sum \xi_n t^n$; this
shows that the right side of (30') is positive. Note that it follows that $a_0$ is positive.

Hans Hamburger has shown that this necessary condition for the $a_n$ to be
represented in form (29) is also sufficient:


Theorem 6 (Hamburger). Let $\{a_n\}$ be a sequence of real numbers such that the
quadratic form (30) is positive. Then the $a_n$ are the moments of a nonnegative measure
on the real axis $\mathbb{R}$; that is, they can be represented in the form (29).
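
A small worked instance may help fix the meaning of Hankel positivity. The measure below (Lebesgue measure on the unit interval) and the truncation to N = 1 are choices made here for illustration; they do not come from the text.

```latex
% Illustrative assumption: dm = dt on [0, 1], so the moments are
\[
  a_n = \int_{0}^{1} t^{n}\,dt = \frac{1}{n+1}, \qquad
  a_0 = 1,\quad a_1 = \tfrac12,\quad a_2 = \tfrac13 .
\]
% Quadratic form (30) with N = 1, i.e. only \xi_0 and \xi_1 nonzero:
\[
  Q = a_0\xi_0^{2} + 2a_1\xi_0\xi_1 + a_2\xi_1^{2}
    = \xi_0^{2} + \xi_0\xi_1 + \tfrac13\,\xi_1^{2}
    = \Bigl(\xi_0 + \tfrac12\,\xi_1\Bigr)^{2} + \tfrac{1}{12}\,\xi_1^{2} .
\]
% This agrees with the integral form (30'):
\[
  \int_{0}^{1} (\xi_0 + \xi_1 t)^{2}\,dt
  = \xi_0^{2} + \xi_0\xi_1 + \tfrac13\,\xi_1^{2} > 0
  \quad\text{unless } \xi_0 = \xi_1 = 0 .
\]
```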


Proof. Denote by $D$ the linear space of all finite sequences of real numbers:


$$x = (\xi_0, \xi_1, \ldots, \xi_N, 0, \ldots).$$


Define the scalar product $(x, y)$ on $D$ by the quadratic form $Q$ in (30):


$$(x, y) = Q(x, y) = \sum a_{n+k}\xi_n\eta_k.$$ (31)


Denote the completion of $D$ with respect to the norm $\|x\| = Q^{1/2}(x, x)$ by $K$, and
denote by $H = K + iK$ the complexification of $K$.
Define on $D$ the operator $R$ as right shift, that is,


$$Rx = (0, \xi_0, \xi_1, \ldots).$$ (32)

Denote by $e$ the unit vector in $D$:

$$e = (1, 0, 0, \ldots).$$

Then

$$R^n e = (0, \ldots, 0, 1, 0, \ldots),$$

and it follows from the definition (31) of the scalar product that

$$(e, R^n e) = a_n.$$ (33)

$R$ is symmetric, for we can write, denoting $n - 1$ as $\ell$,

$$(Rx, y) = \sum a_{n+k}\xi_{n-1}\eta_k = \sum a_{\ell+k+1}\xi_\ell\eta_k.$$ (34)


The sum on the right is, clearly, a symmetric function of $x$ and $y$.

Since $R$ is a symmetric operator acting on $D$, a dense subspace of the real Hilbert space $K$, it has, according to the corollary of theorem 3, a self-adjoint extension to the complexification $H$ of $K$. Denote by $E$ the projection-valued measure that gives the spectral resolution of such an extension of $R$. According to theorem 1 of chapter 32,

$$(e, R^n e) = \int t^n \, d(Ee, e);$$

combining this with (33), we get the sought-after representation (29), with $m = (Ee, e)$. □
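The two algebraic facts that drive the proof, relation (33) and the symmetry (34) of the right shift, can be verified on finite sequences. This is a small sketch under stated assumptions: finite sequences are stored as Python lists without trailing zeros, the moment sequence $a_n = 1/(n+1)$ is chosen only for illustration, and the function names are hypothetical.

```python
from fractions import Fraction


def inner(a, x, y):
    """Scalar product (31): sum over n, k of a[n + k] * x[n] * y[k]."""
    return sum(a[n + k] * x[n] * y[k]
               for n in range(len(x)) for k in range(len(y)))


def shift(x):
    """Right shift (32): (xi_0, xi_1, ...) -> (0, xi_0, xi_1, ...)."""
    return [0] + list(x)


def shift_power(x, n):
    """Apply the right shift n times."""
    for _ in range(n):
        x = shift(x)
    return x


# Assumed example: moments of Lebesgue measure on [0, 1], a_n = 1 / (n + 1).
moments = [Fraction(1, n + 1) for n in range(12)]
e = [1]  # the vector e = (1, 0, 0, ...), stored without trailing zeros
```

For instance, `inner(moments, e, shift_power(e, 3))` returns `moments[3]`, which is (33) with $n = 3$.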


NOTE. It took Hamburger 150 pages to prove his theorem. Using the theory of self-adjoint operators, it takes less than a page.

A related problem is the Stieltjes moment problem: to represent a sequence of real numbers as the moments of a mass distribution on the positive real axis $\mathbb{R}_+$:

$$a_n = \int_{\mathbb{R}_+} t^n \, dm. \tag{35}$$




The positivity of the quadratic form $Q$ defined in (30) is certainly a condition necessary for a representation of form (35). So is the positivity of

$$\sum a_{n+k+1} \xi_n \xi_k. \tag{36}$$

This is evident from the formula

$$\sum a_{n+k+1} \xi_n \xi_k = \int_{\mathbb{R}_+} \sum t^{n+k+1} \xi_n \xi_k \, dm = \int_{\mathbb{R}_+} t \Big( \sum \xi_n t^n \Big)^2 dm.$$

The positivity of (36) can be expressed by setting $y = x$ in (34) and writing

$$(Rx, x) > 0 \quad \text{for every nonzero } x \text{ in } D.$$


This inequality can be expressed by saying that $R$ is a positive operator on $D$. Then $R$ remains positive on the complexification of $D$, and so Friedrichs' procedure can be used to extend $R$ to $H$ as a positive self-adjoint operator. As before, we use the spectral projections $E$ of the Friedrichs extension of $R$ to write

$$a_n = (e, R^n e) = \int t^n \, d(Ee, e).$$


The support of the spectral measure $E$ lies on the spectrum of the extension of $R$. Since this is a positive operator, its spectrum lies on the positive axis $\mathbb{R}_+$; see theorem 7 in chapter 31. It follows that the measure $m = (Ee, e)$ is supported on the positive axis $\mathbb{R}_+$. This completes the proof of the following proposition:


Theorem 7 (Stieltjes). Let $\{a_n\}$ be a sequence of real numbers such that the quadratic forms (30) and (36) are positive. Then the $a_n$ are the moments of a nonnegative measure on the positive axis $\mathbb{R}_+$; that is, they can be represented in the form (35).
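The role of the second condition (36) shows up already in small examples. The sketch below is illustrative only: it compares the moments of Lebesgue measure on $[0, 1]$, which lies on the positive axis, with those of Lebesgue measure on $[-1, 1]$, which does not. Both example sequences and all function names are assumptions made for this check, and positivity is tested on truncated forms by exact elimination.

```python
from fractions import Fraction


def positive_definite(matrix):
    """Exact test: all pivots of Gaussian elimination (no row swaps) are positive."""
    h = [[Fraction(v) for v in row] for row in matrix]
    size = len(h)
    for i in range(size):
        if h[i][i] <= 0:
            return False
        for r in range(i + 1, size):
            factor = h[r][i] / h[i][i]
            for c in range(i, size):
                h[r][c] -= factor * h[i][c]
    return True


def hankel(a, size, offset=0):
    """Matrix with entries a[n + k + offset]; offset=0 gives (30), offset=1 gives (36)."""
    return [[a[n + k + offset] for k in range(size)] for n in range(size)]


def stieltjes_conditions(a, size):
    """Return (form (30) positive, form (36) positive) on vectors of length size."""
    return positive_definite(hankel(a, size)), positive_definite(hankel(a, size, 1))


# Lebesgue measure on [0, 1]: a_n = 1 / (n + 1).
on_half_line = [Fraction(1, n + 1) for n in range(10)]
# Lebesgue measure on [-1, 1]: a_n = 2 / (n + 1) for even n, 0 for odd n.
symmetric = [Fraction(2, n + 1) if n % 2 == 0 else Fraction(0) for n in range(10)]
```

The symmetric sequence passes the Hamburger condition (30) but fails (36), because $a_1 = 0$ makes the shifted form vanish on $\xi = (1, 0, \ldots)$.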


What about uniqueness of the representing measure? The following is a simple example of a sequence that can be represented as the moments of two distinct measures:

Let $f$ be a real-valued even $C^\infty$ function on $\mathbb{R}$ that vanishes to infinite order at the origin; that is, all its derivatives are zero there. Denote its Fourier transform by $g$. Since $f$ is even, $g$ is real valued, and since $f$ is $C^\infty$, $g(t)$ tends to zero faster than any negative power of $t$ as $t$ tends to $\infty$. By Fourier inversion,

$$\int g(t) e^{ist} \, dt = f(s).$$

Differentiate $n$ times both sides with respect to $s$ and set $s = 0$:

$$i^n \int t^n g(t) \, dt = f^{(n)}(0) = 0,$$

meaning that all moments of $g$ are zero. Writing $g$ as the difference of its positive and negative part, $g = g_+ - g_-$, we conclude that the nonnegative functions $g_+$ and $g_-$ have the same moments. Since $g$ is not $\equiv 0$, $g_+$ and $g_-$ are distinct, a case of nonuniqueness for the Hamburger moment problem.

It is almost as easy to give examples of moment problems whose solutions are unique, for instance, when the sequence of moments is bounded,

$$|a_n| \le \text{const}.$$

An example of such a sequence is the moments of Lebesgue measure on $[0, 1]$, and zero everywhere else. Here

$$a_n = \int_0^1 t^n \, dt = \frac{1}{n+1}$$

is a bounded sequence.
Let $r$ be an arbitrary positive number; multiply (29) by $r^n/n!$ and sum from 0 to $N$:

$$\sum_0^N a_n \frac{r^n}{n!} = \int \sum_0^N \frac{(rt)^n}{n!} \, dm.$$

Since the $a_n$ are bounded, the left side is $\le \text{const.}\, e^r$. Since the integrand on the right is positive and tends to $e^{rt}$, uniformly on any compact interval, as $N$ tends to $\infty$, it follows that $e^{rt}$ is integrable with respect to the measure $m$, and that

$$\int e^{rt} \, dm = \sum_0^\infty a_n \frac{r^n}{n!}.$$

It follows that this relation holds for any complex $r$, in particular, for $r = is$:

$$\int e^{ist} \, dm = \sum_0^\infty a_n \frac{(is)^n}{n!}.$$

This shows that the Fourier transform of $m$ is uniquely determined by its moments $a_n$. Since a finite measure is uniquely determined by its Fourier transform, it follows that the moments $\{a_n\}$ uniquely determine the measure $m$.
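The series representation of the Fourier transform can be tested numerically on the bounded moment sequence $a_n = 1/(n+1)$. This is a rough sketch, not part of the argument: the closed form $(e^{is} - 1)/(is)$ for the Fourier transform of Lebesgue measure on $[0, 1]$ is used as the reference value, the series is truncated after 60 terms, and the function names are illustrative.

```python
import cmath


def fourier_from_moments(moments, s):
    """Sum the series sum_n a_n (i s)^n / n! over the given finite list of moments."""
    total = 0j
    term = 1 + 0j  # holds (i s)^n / n!
    for n, a_n in enumerate(moments):
        total += a_n * term
        term *= 1j * s / (n + 1)
    return total


def fourier_lebesgue(s):
    """Closed form of the integral of e^{ist} over [0, 1], valid for s != 0."""
    return (cmath.exp(1j * s) - 1) / (1j * s)


moments = [1.0 / (n + 1) for n in range(60)]
s = 2.0
error = abs(fourier_from_moments(moments, s) - fourier_lebesgue(s))
```

At $s = 2$ the two values agree to within floating-point rounding, as the uniqueness argument predicts.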
The story is similar for the Stieltjes moment problem. Stieltjes already has given an example of a sequence $\{a_n\}$ that is the sequence of moments of two distinct measures on $\mathbb{R}_+$, and on the other hand sequences $\{a_n\}$ that are the moments of a single measure on $\mathbb{R}_+$.
It is an interesting problem—and at any rate a classical one—to characterize those 

moment problems that have a unique solution. 




Theorem 8. 


(i) The moment problem (29) has a unique solution iff the operator $R$ is essentially self-adjoint, that is, has a unique self-adjoint extension.

(ii) The moment problem (35) has a unique solution iff the operator $R$ has a unique nonnegative self-adjoint extension.


Neither part of the theorem is obvious. When $R$ has two distinct self-adjoint extensions $R_1$ and $R_2$, it is not clear that the measures $(E_1 e, e)$ and $(E_2 e, e)$ are distinct. On the other hand, when $R$ is not essentially self-adjoint, there are solutions $m$ of the moment problem that are not of the form $(Ee, e)$, where $E$ is the spectral resolution of some self-adjoint extension of $R$.

For a proof of theorem 8, and for a review of the literature of the moment problem, we refer to the article by Barry Simon; see also Henry Landau's article in the AMS Symposium volume edited by him, as well as the books by Akhiezer, and Shohat and Tamarkin.


HISTORICAL NOTES. Stieltjes introduced the integral named after him in connection 
with his work on the moment problem. 

The theory of self-adjoint operators was created by von Neumann to fashion a
framework for quantum mechanics. The operators in Schrödinger's theory that are
associated with atoms are partial differential operators whose coefficients are sin- 
gular at certain points; these singularities correspond to the unbounded growth of 
the force between two electrons that approach each other. To define such differen- 
tial operators as self-adjoint ones is not a trivial task. Examples 5 and 6 presented 
in section 33.4 allow some singularities in the potential q, but the ones occurring 
in quantum mechanics are more singular still. I recall in the summer of 1951 the
excitement and elation of von Neumann when he learned that Kato had proved the
self-adjointness of the Schrödinger operator associated with the helium atom.

And what do the physicists think of these matters? In the 1960s Friedrichs met 
Heisenberg, and used the occasion to express to him the deep gratitude of the community
of mathematicians for having created quantum mechanics, which gave birth
to the beautiful theory of operators in Hilbert space. Heisenberg allowed that this was
so; Friedrichs then added that the mathematicians have, in some measure, returned 
the favor. Heisenberg looked noncommittal, so Friedrichs pointed out that it was 
a mathematician, von Neumann, who clarified the difference between a self-adjoint 
operator and one that is merely symmetric. “What’s the difference,” said Heisenberg. 


BIBLIOGRAPHY 


Akhiezer, N. I. The Classical Moment Problem. Hafner, New York, 1965.
Friedrichs, K. O. Spektraltheorie halbbeschränkter Operatoren. Math. An., 109 (1934): 465-487, 685-713.


Hamburger, H. Über eine Erweiterung des Stieltjesschen Moment Problems. Math. An., 81 (1920): 235-319, 82 (1921): 120-164, 168-187.




Kato, T. Fundamental properties of Hamiltonian operators of Schrödinger type. Trans. AMS, 70 (1951): 195-211.


Kato, T. On the existence of solutions of the helium wave equation. Trans. AMS, 70 (1951): 212-218. 


Kato, T. Perturbation Theory for Linear Operators. Die Grundlehren der Math. Wiss. in Einzeldarstellung, 
132. Springer, Berlin, 1966. 


Landau, H. J., ed. Moments in Mathematics. Proc. Symp. Appl. Math., 37. American Mathematical Soci- 
ety, Providence, RI, 1987. 


Reed, M. and Simon, B. Methods of Modern Mathematical Physics: Vol. 1, Functional Analysis. Aca- 
demic Press, New York, 1972. 


Rellich, F. Störungstheorie der Spektralzerlegung. Math. An., 116 (1939): 555-570. 


Shohat, J. A. and Tamarkin, J. D. The Problem of Moments. AMS Surveys 1. American Mathematical 
Society, New York, 1943. 


Simon, B. The classical moment problem as a self-adjoint finite difference operator. Adv. Math., 137 
(1998): 82-203. 


Stieltjes, T. Recherches sur les fractions continues. Ann. Fac. Sc. Univ. Toulouse, 8 (1894-95): J1-J22; 9: A5-A47.


Stone, M. H. Linear Transformations in Hilbert Space and Their Applications to Analysis. AMS Coll. 
Publ., 15. American Mathematical Society, New York, 1932. 


von Neumann, J. Allgemeine Eigenwerttheorie Hermitescher Funktionaloperatoren. Math. An., 102 (1929): 49-131.


von Neumann, J. Mathematische Grundlagen der Quantenmechanik. Die Grundlehren der Math. Wiss. in Einzeldarstellung, 37. Springer, Berlin, 1932.


SEMIGROUPS OF OPERATORS


The natural sources of semigroups of operators are partial differential equations describing evolution in time, and flows generated by dynamical systems. In this chapter we present an abstraction of these concrete situations, a point of view initiated by Hille. In the next two chapters we will present illustrations and applications of the theory. For a detailed treatment of this subject, we recommend Hille-Phillips, Yosida, and Goldstein's excellent monograph.


Definition. A one-parameter semigroup of operators over a complex Banach space $X$ is a family of bounded linear operators $Z(t)$, $t \ge 0$, each mapping $X \to X$, with the following properties:

$$Z(t + s) = Z(t)Z(s) \quad \text{for all } t, s \ge 0; \qquad Z(0) = I. \tag{1}$$

Equation (1) is the multiplicative property of exponential functions. We show next that under an additional continuity property, equation (1) characterizes exponential functions.


Theorem 1.
(i) Let $G: X \to X$ be a bounded linear map. Define $Z(t)$ to be

$$Z(t) = e^{tG}, \quad t \ge 0, \tag{2}$$

where the exponential of the operator is defined by the power series

$$e^{tG} = \sum_{n=0}^{\infty} \frac{t^n G^n}{n!}. \tag{3}$$

Then $Z(t)$ is a one-parameter semigroup of operators, continuous in the norm topology for operators.




(ii) Conversely, let $Z(t): X \to X$ be a one-parameter semigroup of operators continuous in the norm topology at $t = 0$:

$$\lim_{t \to 0} |Z(t) - I| = 0. \tag{4}$$

Then $Z(t)$ is of form (2), $G$ some bounded linear map $X \to X$.

NOTE. Formula (2) defines $Z(t)$ also for $t < 0$; these operators form a group.
Proof. (i) is a special case of the functional calculus for operators; see chapter 17, theorem 4 (ii). So is part (ii) of theorem 1 above. Suppose that $Z(t)$ is a one-parameter semigroup uniformly continuous at $t = 0$. The logarithm function

$$\log t = (t - 1) - \frac{(t - 1)^2}{2} + \cdots$$

is analytic in the unit disk around 1. Therefore for any operator $Z$ that differs from $I$ by an operator of norm $< 1$ we can define

$$\log Z = \log(I + Z - I) = Z - I - \frac{(Z - I)^2}{2} + \cdots. \tag{5}$$

Exercise 1. Denote by Z and W two operators mapping X into X that commute. 
Suppose that ||Z — Il] < ; and ||W —I]| < i Show that ||ZW —I|| < 1, and prove 
that 


log ZW = logZ + log W, 
where the logarithms are defined by formula (5). 


By (4), there is an $a > 0$ such that $|Z(t) - I| < 1$ for $t < a$. We define $L(t) = \log Z(t)$, $t < a$, by formula (5). The multiplicative property (1) implies that $Z(t)$ and $Z(s)$ commute; therefore we deduce from (1) and (5) that

$$L(t + s) = L(t) + L(s), \quad t + s < a.$$

From this we deduce that for all rational $t < a$, $t^{-1} L(t)$ is independent of $t$; denote this operator by $G$:

$$L(t) = tG, \quad t \text{ rational}, \; t < a. \tag{6}$$

The multiplicative property (1), that is, $Z(t + h) - Z(t) = Z(t)[Z(h) - I]$, and continuity at $t = 0$ show that $Z(t)$ is continuous for all $t$ in the uniform topology. By (5), the same is true for $L(t)$ where $t < a$. From this we deduce that (6) holds not just for $t$ rational but for all $t < a$. Exponentiating gives (2) for $t < a$. By (1), it holds for all $t$. □
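Both halves of theorem 1 can be tried out on a small matrix. The sketch below is an illustration under assumptions that are not in the text: the generator is the $2 \times 2$ rotation generator $G$, matrices are stored as nested lists, and the series (3) and (5) are truncated after a fixed number of terms. It checks the semigroup property (1) and recovers $tG$ from $Z(t)$ by the logarithm series.

```python
import math


def mat_mul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_scale(c, a):
    return [[c * x for x in row] for row in a]


def identity(size):
    return [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]


def exp_series(g, t, terms=40):
    """Partial sum of (3): sum over n of t^n G^n / n!."""
    total = identity(len(g))
    term = identity(len(g))
    for n in range(1, terms):
        term = mat_scale(t / n, mat_mul(term, g))
        total = mat_add(total, term)
    return total


def log_series(z, terms=80):
    """Partial sum of (5): (Z - I) - (Z - I)^2 / 2 + (Z - I)^3 / 3 - ..."""
    d = mat_add(z, mat_scale(-1.0, identity(len(z))))
    total = [[0.0] * len(z) for _ in z]
    power = identity(len(z))
    for n in range(1, terms):
        power = mat_mul(power, d)
        total = mat_add(total, mat_scale((-1.0) ** (n + 1) / n, power))
    return total


def max_diff(a, b):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


G = [[0.0, 1.0], [-1.0, 0.0]]  # assumed example: generator of plane rotations
```

For this $G$, `exp_series(G, t)` is the rotation matrix with entries $\cos t$ and $\pm \sin t$, and `log_series(exp_series(G, 0.3))` returns $0.3\,G$ up to rounding, since $|Z(0.3) - I| < 1$.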


34.1 STRONGLY CONTINUOUS ONE-PARAMETER SEMIGROUPS 


The most interesting semigroups are not of the form (2) but are associated with differential equations, of which a typical example is the heat equation,

$$u_t - \partial_x^2 u = 0,$$

$u$ being a periodic function of $x$. A solution of the heat equation is uniquely determined by specifying its initial values, namely its value at $t = 0$. The initial value can be specified as an arbitrary continuous function; the absolute value of the corresponding solution at any later time does not exceed the maximum absolute value of the prescribed initial values. Denote by $Z(t)$ the operator relating the initial values $u(x, 0)$ of solutions to $u(x, t)$. Clearly, these operators form a one-parameter semigroup in the sense of (1). Yet they are not of the form (2); for if they were, they could be extended to negative values of $t$ to form a group of operators. But it is well known that such extension of solutions of the heat equation backward in time is not possible in general.
The heat equation semigroup $Z(t)$ is not uniformly continuous at $t = 0$; yet it retains a less stringent kind of continuity that expresses the fact that each solution $u$ is a continuous function of $t$:


Definition. A one-parameter semigroup $Z(t)$ is strongly continuous at $t = 0$ if

$$\operatorname*{s-lim}_{t \to 0} Z(t)x = x \tag{7}$$

for all $x$ in $X$.
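The gap between uniform and strong continuity can be made concrete for the heat semigroup. The following sketch rests on assumptions that go beyond the text: it represents a periodic function by its Fourier coefficients, on which $Z(t)$ multiplies the $k$-th coefficient by $e^{-k^2 t}$, it measures size in the $\ell^2$ norm of the coefficients rather than in the maximum norm used above, and it truncates to finitely many modes. The vector with coefficients $x_k = 1/k$ is an arbitrary illustrative choice.

```python
import math


def operator_gap(t, modes=2000):
    """Max over modes k <= modes of |e^{-k^2 t} - 1|, a lower bound for |Z(t) - I|."""
    return max(1.0 - math.exp(-k * k * t) for k in range(1, modes + 1))


def strong_gap(t, modes=2000):
    """Norm of Z(t)x - x for the fixed vector x with coefficients x_k = 1 / k."""
    return math.sqrt(sum(((1.0 - math.exp(-k * k * t)) / k) ** 2
                         for k in range(1, modes + 1)))
```

For every fixed $t > 0$ the supremum of $1 - e^{-k^2 t}$ over all modes is 1, so $|Z(t) - I|$ does not tend to zero; `operator_gap(1e-6)` is already close to 1 with 2000 modes. By contrast `strong_gap(t)` shrinks as $t$ decreases, which is the behavior that (7) describes.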


Theorem 2. Denote by $Z(t)$ a one-parameter semigroup of operators that is strongly continuous at $t = 0$.

(i) There exist constants $b$ and $k$ such that $Z(t)$ is bounded in norm by

$$|Z(t)| \le b e^{kt}. \tag{8}$$

(ii) $Z(t)x$ is a strongly continuous function of $t$ for every $x$ in $X$.


Proof. (i) We claim that $|Z(t)|$ is uniformly bounded in some neighborhood of $t = 0$. To see this, suppose, to the contrary, that there is a sequence $t_j \to 0$ such that $|Z(t_j)| \to \infty$. By the principle of uniform boundedness (see chapter 14, theorem 7) $Z(t_j)x$ could not converge to $x$ for all $x$ in $X$. This violates strong continuity at $t = 0$; therefore there exist an $a > 0$ and a $b > 0$ such that $|Z(t)| \le b$ for $t \le a$.

Any $t$ can be decomposed as $t = na + r$, $0 \le r < a$. By the semigroup property, $Z(t) = Z^n(a) Z(r)$. So

$$|Z(t)| \le |Z(a)|^n |Z(r)| \le b^{n+1} \le b e^{kt},$$

where $k = \frac{1}{a} \log b$; the last step uses $b \ge 1$, which holds since $Z(0) = I$, and $na \le t$. This proves (8).




(ii) For any pair of positive numbers $s < t$ we can, by the semigroup property, write

$$Z(t)x - Z(s)x = Z(s)[Z(t - s)x - x].$$

Combining (7) and (8), strong continuity follows. □


Next we show, speaking loosely, that a strongly continuous semigroup can be 
interpreted as an exponential function of an unbounded, not everywhere defined, 
operator. We start with a few facts and notions about unbounded operators: 


Definition. Let $D$ be a dense linear subspace of a Banach space $X$, $G$ a linear operator mapping $D$ into $X$. We call the operator $G$ closed if whenever $\{x_n\}$ is a sequence of vectors in $D$ such that $x_n \to x$ and $Gx_n \to y$, then $x$ lies in $D$, and $Gx = y$. $D$ is called the domain of $G$, and is denoted as $D(G)$.


According to the closed graph theorem, a closed linear operator defined at every 
point of a Banach space X is bounded. The operators we encounter in this chapter are 
defined only on a dense subspace of X and are unbounded. 


Definition. Let $G$ be a closed operator with domain $D(G)$; a complex number $\zeta$ belongs to the resolvent set of $G$, denoted as $\rho(G)$, if $\zeta I - G$ maps $D(G)$ one-to-one onto $X$. The spectrum of $G$, denoted as $\sigma(G)$, is the complement of its resolvent set.


Suppose that $\zeta$ is in the resolvent set of $G$; then $\zeta I - G$ is invertible. Its inverse is called the resolvent of $G$ and is denoted as

$$R(\zeta) = (\zeta I - G)^{-1};$$

it maps $X$ onto $D(G)$. Since $G$ is closed, so is $R(\zeta)$; it follows then from the closed graph theorem, theorem 12 of chapter 15, that the resolvent $R(\zeta)$ is a bounded operator.


Exercise 2. Suppose that the resolvent set of $G$ is not empty, and that $\zeta$ belongs to $\rho(G)$. Show that a complex number $\gamma$ belongs to the spectrum of $G$ iff $(\zeta - \gamma)^{-1}$ belongs to the spectrum of $R(\zeta)$. We can express this symbolically as

$$\sigma(R(\zeta)) = (\zeta - \sigma(G))^{-1}. \tag{9}$$

One can think of (9) as an instance of the spectral mapping theorem for unbounded operators.


Exercise 3. Deduce from (9) that the spectrum of G is a closed set in the complex 
plane. 


Next we define the transpose of an unbounded operator: 




Definition. Let $G$ be a densely defined, closed linear operator in a Banach space $X$. We define its transpose $G'$ by the relation

$$(Gx, \ell) = (x, G'\ell). \tag{10}$$

The meaning of (10) is this: The domain of $G'$ consists of those linear functionals $\ell$ for which the left side of (10) is a bounded linear functional of $x$, defined on $D(G)$. Since $D(G)$ is dense in $X$, this bounded linear functional can be extended uniquely from $D(G)$ to all of $X$; this extension is denoted by $G'\ell$, and the domain of $G'$ by $D(G')$.


Exercise 4. Show that the transpose of a densely defined linear operator is closed.


The difficulty with the preceding definition is that the domain of G’ is an elusive 
thing. The following result is useful for pinning down the domain of G’. 


Theorem 3. Let $X$ be a reflexive Banach space, $G$ a densely defined, closed linear operator mapping $D(G)$ into $X$, whose resolvent set is not empty. Then its transpose $G'$ is a densely defined, closed linear operator mapping $D(G')$ into $X'$, and $\rho(G') = \rho(G)$.


Proof. Let $\zeta$ be a complex number in the resolvent set of $G$; subtract $\zeta (x, \ell)$ from both sides of (10):

$$((G - \zeta I)x, \ell) = (x, (G' - \zeta I')\ell). \tag{10'}$$

Denote the resolvent of $G$ by $R(\zeta) = -(G - \zeta I)^{-1}$ and define $R'(\zeta)$ as the transpose of $R(\zeta)$:

$$(y, R'(\zeta)m) = (R(\zeta)y, m). \tag{10''}$$

We claim that $R'(\zeta)$ is the resolvent of $G'$. We show first that $R'(\zeta)$ is one-to-one; for otherwise there would be a nonzero $m$ such that $R'(\zeta)m = 0$. Setting this into (10''), we conclude that the range of $R(\zeta)$ is annihilated by $m$; since the range of $R(\zeta)$ is $D(G)$, assumed to be dense, this is a contradiction. Now define $(R')^{-1}(\zeta)$ to be $-G' + \zeta I'$ and set $R(\zeta)y = x$ and $R'(\zeta)m = \ell$ into (10''). Thus we obtain (10').

The domain of $G'$ as defined above is the range of $R'(\zeta)$. It remains to be shown that $G'$ cannot be extended any further. To see this, we note that the range of $G' - \zeta I'$ as defined above is all of $X'$. Therefore, if we extend $G'$ further, $G' - \zeta I'$ would annihilate some nonzero $\ell$. Setting this into (10'), we would conclude that $\ell$ annihilates the range of $G - \zeta I$, a contradiction since $\zeta$ belongs to the resolvent set of $G$.

The last task is to show that the domain of $G'$, identified above as the range of $R'(\zeta)$, is dense in $X'$. If it were not, then there would be a nonzero $y$ in $X$ that annihilates the range of $R'(\zeta)$ (it is at this point that we use the reflexivity of $X$). Setting this into (10''), we conclude that $R(\zeta)y = 0$, contrary to the fact that $\zeta$ belongs to the resolvent set of $G$. □




Exercise 5. Show that in the Hilbert space setting the conclusion of theorem 3 has to be modified as follows: $\rho(G^*) = \overline{\rho(G)}$.


Definition. Let $Z(t)$ be a strongly continuous one-parameter semigroup of operators: $X \to X$, meaning that (1) and (7) are satisfied. Its infinitesimal generator $G$ is defined by

$$Gx = \operatorname*{s-lim}_{h \to 0} \frac{Z(h)x - x}{h}; \tag{11}$$

the domain of $G$, denoted as $D(G)$, consists of all $x$ for which the strong limit (11) exists.


Theorem 4. Let $Z(t)$ be a strongly continuous one-parameter semigroup, $G$ its infinitesimal generator.

(i) $G$ commutes with $Z(t)$, in the sense that if $x$ belongs to $D(G)$, so does $Z(t)x$, and

$$GZ(t)x = Z(t)Gx. \tag{12}$$

(ii) The domain of $G$ is dense.

(iii) The domain of $G^n$, $n$ any natural number, is dense.

(iv) $G$ is a closed operator.

(v) All complex numbers $\zeta$ whose real part is $> k$ belong to the resolvent set of $G$, where $k$ is the constant appearing in inequality (8). The resolvent of $G$ is the Laplace transform of $Z$.


Proof. (i) Using (1), we can factor the difference quotient in two ways:

$$\frac{Z(t + h) - Z(t)}{h} x = Z(t) \frac{Z(h) - I}{h} x = \frac{Z(h) - I}{h} Z(t)x. \tag{13}$$

If $x$ belongs to $D(G)$, the middle term converges as $h \to 0$ to $Z(t)Gx$. Therefore the terms on the right and left converge also, and we deduce from (13) that

$$\frac{d}{dt} Z(t)x = Z(t)Gx = GZ(t)x \tag{14}$$

for every $x$ in $D(G)$. This proves part (i).

(ii) We claim that an integrated form of (14),

$$Z(t)x - x = G \int_0^t Z(s)x \, ds, \tag{15}$$

is valid for all $x$ in $X$. Since $Z(s)$ is strongly continuous, the integrand on the right in (15) is a continuous function of $s$; therefore the integral can be defined as the limit of




Riemann sums. To prove (15), we want to evaluate the action of $G$ on this integral. Letting $Z(h)$ act on the integrand, and using the semigroup property, we get

$$\frac{Z(h) - I}{h} \int_0^t Z(s)x \, ds = \frac{1}{h} \int_0^t [Z(s + h)x - Z(s)x] \, ds = \frac{1}{h} \int_t^{t+h} Z(s)x \, ds - \frac{1}{h} \int_0^h Z(s)x \, ds.$$

Since $Z(s)x$ is strongly continuous, the terms on the right converge, as $h \to 0$, to the left side of (15); this proves that for any $x$ in $X$, $\int_0^t Z(s)x \, ds$ belongs to $D(G)$, and that (15) holds.

It follows from the strong continuity of $Z$ that for any $x$ in $X$,

$$\lim_{t \to 0} \frac{1}{t} \int_0^t Z(s)x \, ds = x;$$

this proves that $D(G)$ is dense in $X$.
(iii) We argue similarly about the domain of higher powers of $G$. Denote by $\phi$ any infinitely differentiable function on $\mathbb{R}$ supported on $[0, 1]$. For any $x$ in $X$ we define

$$x_\phi = \int \phi(s) Z(s)x \, ds.$$

The same argument as above shows that $x_\phi$ belongs to the domain of $G$, and that

$$Gx_\phi = -\int \phi'(s) Z(s)x \, ds.$$

Clearly, $x_\phi$ belongs to the domain of all $G^n$. We choose now a sequence of $\phi_j$ that are nonnegative, satisfy $\int \phi_j \, ds = 1$, and whose support tends to zero. Appealing once more to strong continuity, we conclude that $x_{\phi_j}$ tends to $x$. This proves that $D(G^n)$ is dense in $X$.

(iv) We claim that the following integrated form of (14) is valid for all $x$ in $D(G)$:

$$Z(t)x - x = \int_0^t Z(s)Gx \, ds, \tag{15'}$$

where the integral on the right is a Riemann integral. For proof we appeal to the basic theorem of calculus: if two functions whose values lie in a Banach space, and that have continuous strong derivatives, have the same derivative and are equal at $t = 0$, then they are equal for all $t$.


Exercise 6. Prove the basic theorem of calculus for vector-valued functions.

We apply the fundamental theorem to the functions on the two sides of (15'). Both are 0 for $t = 0$. By (14), the derivative of the function on the left is $Z(t)Gx$. The derivative of the indefinite integral on the right is also $Z(t)Gx$; this proves (15').




Let $\{x_n\}$ be a sequence of elements in the domain of $G$ such that $x_n \to x$, $Gx_n \to y$. We claim that $x$ lies in $D(G)$, and that $Gx = y$. Take $x$ to be $x_n$ in (15'),

$$Z(t)x_n - x_n = \int_0^t Z(s)Gx_n \, ds,$$

and let $n \to \infty$. Both sides converge, and their limits are equal:

$$Z(t)x - x = \int_0^t Z(s)y \, ds.$$

Divide by $t$, and let $t \to 0$. The right side tends to $y$; this shows that $x$ belongs to $D(G)$ and that $Gx = y$.
(v) The Laplace transform of $Z$ is defined as

$$L(\zeta)x = \int_0^\infty e^{-\zeta s} Z(s)x \, ds, \tag{16}$$

where the integral on the right is defined as the limit of the Riemann integral from 0 to $T$ as $T \to \infty$. Since, by (8), $Z$ grows at most exponentially, $|Z(s)| \le b e^{ks}$, it follows that the integral on the right of (16) converges when $\operatorname{Re} \zeta > k$, and that

$$|L(\zeta)x| \le \int_0^\infty b e^{(k - \operatorname{Re} \zeta)s} |x| \, ds = \frac{b}{\operatorname{Re} \zeta - k} |x|.$$

This shows that $L(\zeta)$ is a bounded operator and that

$$|L(\zeta)| \le \frac{b}{\operatorname{Re} \zeta - k}. \tag{17}$$


We claim that $L(\zeta) = R(\zeta)$, that is, the inverse of $\zeta I - G$. To prove this, we look at the modified semigroup $e^{-\zeta t} Z(t)$. It is easily verified that this is also a strongly continuous semigroup, and that its infinitesimal generator is $G - \zeta I$, where $G$ denotes the generator of the original semigroup. We apply (15) to the modified semigroup:

$$e^{-\zeta t} Z(t)x - x = (G - \zeta I) \int_0^t e^{-\zeta s} Z(s)x \, ds.$$

Suppose that $\operatorname{Re} \zeta > k$; as $t \to \infty$, the left side tends to $-x$. The integral on the right tends to $L(\zeta)x$; since $G$ is a closed operator, we conclude that

$$-x = (G - \zeta I) L(\zeta)x.$$

This shows that $L(\zeta)$ is a right inverse of $\zeta I - G$. To deduce from this that $L(\zeta)$ is the inverse of $\zeta I - G$, we use (15') in place of (15). □
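Part (v) can be checked numerically in a case where everything is explicit. The sketch below is an illustration only: it assumes the bounded generator $G$ of plane rotations, for which $Z(s)$ has entries $\cos s$ and $\pm \sin s$ and (8) holds with $b = 1$, $k = 0$. The integral (16) is truncated at a finite horizon and evaluated by a composite Simpson rule, and all function names are hypothetical.

```python
import math


def simpson(f, a, b, steps):
    """Composite Simpson rule with an even number of steps."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3


def laplace_entry(entry, zeta, horizon=40.0, steps=40000):
    """Approximate the integral of e^{-zeta s} * entry(s) over [0, horizon]."""
    return simpson(lambda s: math.exp(-zeta * s) * entry(s), 0.0, horizon, steps)


def resolvent_rotation(zeta):
    """(zeta I - G)^{-1} for G = [[0, 1], [-1, 0]], by the 2x2 inverse formula."""
    det = zeta * zeta + 1.0
    return [[zeta / det, 1.0 / det], [-1.0 / det, zeta / det]]


zeta = 1.0
# Z(s) = [[cos s, sin s], [-sin s, cos s]] for the rotation generator G.
laplace = [[laplace_entry(math.cos, zeta), laplace_entry(math.sin, zeta)],
           [-laplace_entry(math.sin, zeta), laplace_entry(math.cos, zeta)]]
```

The entries of `laplace` agree with those of `resolvent_rotation(zeta)` up to quadrature error, in line with $L(\zeta) = R(\zeta)$ for $\operatorname{Re} \zeta > k$.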


As we will see in chapters 35 and 36, there are plenty of semigroups of operators 
that are strongly, but not uniformly, continuous; however replacing strong with weak 
continuity doesn’t add much. 




Theorem 5. Let $X$ denote a Banach space, $Z(t): X \to X$ a one-parameter family of bounded linear operators that is weakly continuous at $t = 0$:

$$\lim_{t \to 0} (Z(t)x, \ell) = (x, \ell)$$

for all $x$ in $X$ and $\ell$ in $X'$. Then $Z(t)$ is strongly continuous.


Proof. This is a somewhat surprising result, for weak continuity demands much less than strong continuity. The proof, which we omit here, is rather tricky; see Hille-Phillips or Goldstein. □


34.2 THE GENERATION OF SEMIGROUPS


Theorem 6. A strongly continuous semigroup of operators is uniquely determined 
by its infinitesimal generator. 


Proof. Suppose that $Z(t)$ and $W(t)$ have the same generator $G$. Take $x$ in $D(G)$; using the commutation rules (14) and (12), we get

$$\frac{d}{dt} W(t) Z(s - t)x = W(t) G Z(s - t)x - W(t) G Z(s - t)x = 0 \tag{18}$$

for all $x$ in $D(G)$. Then, by the fundamental theorem of calculus, $W(t)Z(s - t)x$ is independent of $t$; comparing its values at $t = 0$ and $t = s$, we conclude that $W(s)x = Z(s)x$. Since $D(G)$ is dense, this holds for all $x$ in $X$. □


We show now how to reconstruct $Z(t)$ from $G$ in the case where $Z(t)$ is a contraction, that is,

$$|Z(t)| \le 1 \quad \text{for all } t \ge 0. \tag{19}$$


Theorem 7.

(i) The infinitesimal generator $G$ of a strongly continuous semigroup of contractions has every positive, real $\lambda$ in its resolvent set, and

$$|R(\lambda)| = |(\lambda I - G)^{-1}| \le \frac{1}{\lambda}. \tag{20}$$

(ii) Conversely, every densely defined unbounded operator $G$ whose resolvent set includes the positive reals, and whose resolvent is bounded by (20), is the generator of a strongly continuous semigroup of contractions.


This remarkable result is called the Hille-Yosida theorem, after its discoverers. 


Proof. Part (i) is just a restatement of inequality (17), with b = 1, k =0. 




We give Yosida's proof of part (ii). It is based on approximating $G$ by $G_n = nGR(n)$ and letting $n \to \infty$. The identity

$$G_n = n^2 R(n) - nI \tag{21}$$

shows that $G_n$ is a bounded operator. We approximate $Z(t)$ by $Z_n(t) = e^{tG_n}$, where the exponential is defined as an infinite series. We show first: if (20) holds, then for all $x$ in $X$,

$$\lim_{n \to \infty} nR(n)x = x. \tag{22}$$

We use the identity

$$nR(n) - I = R(n)G$$

and inequality (20) to deduce that for $x$ in $D(G)$,

$$|nR(n)x - x| = |R(n)Gx| \le \frac{1}{n} |Gx|.$$

This proves (22) for $x$ in $D(G)$. Since, by (20), the operator $nR(n)$ has norm $\le 1$ for all $n$, and since $D(G)$ is dense in $X$, it follows that (22) holds for all $x$ in $X$.
Next we show that for all $x$ in $D(G)$,

$$\lim_{n \to \infty} G_n x = Gx. \tag{23}$$

By definition of $G_n$, for all $x$ in $D(G)$, $G_n x = nGR(n)x = nR(n)Gx$, so (23) follows from (22).
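By definition of $Z_n$, we obtain, using formula (21) for $G_n$, that

$$Z_n(t) = e^{tG_n} = e^{-nt} e^{n^2 t R(n)} = e^{-nt} \sum_{m=0}^{\infty} \frac{(n^2 t)^m}{m!} R^m(n).$$

Using (20), we deduce that each $Z_n(t)$ is a contraction:

$$|Z_n(t)| \le e^{-nt} \sum_{m=0}^{\infty} \frac{(n^2 t)^m}{m!} \Big( \frac{1}{n} \Big)^m = e^{-nt} e^{nt} = 1. \tag{24}$$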

To estimate the difference of $Z_n$ and $Z_k$, we use the fact that $G_n$ and $G_k$ commute with $Z_n$ and $Z_k$:

$$\frac{d}{dt} Z_n(s - t) Z_k(t)x = Z_n(s - t) Z_k(t) [G_k - G_n]x.$$

Using (24), we get that the norm of the right side is $\le |G_k x - G_n x|$; integrating this inequality with respect to $t$ from 0 to $s$, we deduce that

$$|Z_n(s)x - Z_k(s)x| \le s |G_n x - G_k x|. \tag{25}$$




Combining (23) and (25), we deduce that for all $x$ in $D(G)$, the limit

$$\lim_{n \to \infty} Z_n(s)x = Z(s)x \tag{26}$$

exists, uniformly on bounded sets of $s$. It follows then from the uniform boundedness of $|Z_n|$ that (26) exists for all $x$ in $X$. It follows directly that since $Z_n(s)$ is a semigroup, so is $Z(s)$. Since the convergence in (26) is uniform for $s \le S$, strong continuity of $Z_n$ implies strong continuity of $Z$. Since by (24) each $Z_n$ is a contraction, so is their strong limit $Z$.

It remains to show that the generator of $Z$ is $G$. Apply (15') to $Z_n$:

$$Z_n(t)x - x = \int_0^t Z_n(s) G_n x \, ds.$$

Suppose that $x$ lies in $D(G)$; taking the limit $n \to \infty$, we get, using (23), that

$$Z(t)x - x = \int_0^t Z(s) Gx \, ds.$$

Denote by $H$ the generator of $Z(t)$. Dividing the above by $t$ and letting $t \to 0$, we conclude that $D(H)$ includes $D(G)$ and that $H = G$ on $D(G)$. In other words, $H$ is an extension of $G$. However, since, by theorem 4, $\lambda > 0$ belongs to the resolvent set of both $G$ and $H$, $H$ cannot be a proper extension of $G$. □
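The mechanics of Yosida's approximation are easy to follow in the scalar case. This is a toy sketch, not the general situation: it assumes $G$ is multiplication by a real number $g \le 0$, so that $Z(t) = e^{tg}$ is a contraction and $R(\lambda) = (\lambda - g)^{-1}$. The function names are illustrative. The checks mirror identity (21), the bound (20), and estimate (25) with $Z_k$ replaced by its limit $Z$.

```python
import math


def resolvent(g, lam):
    """R(lam) = (lam - g)^{-1} for the scalar generator g."""
    return 1.0 / (lam - g)


def yosida(g, n):
    """G_n = n g R(n), the bounded approximation used in the proof."""
    return n * g * resolvent(g, n)


def yosida_semigroup(g, n, t):
    """Z_n(t) = exp(t G_n)."""
    return math.exp(t * yosida(g, n))


g = -3.0  # assumed example: generator of the contraction semigroup e^{-3t}
t = 2.0
```

Here `yosida(g, n)` equals $ng/(n - g)$, which tends to $g$ as $n \to \infty$, and `yosida_semigroup(g, n, t)` tends to $e^{tg}$ at the rate suggested by (25).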


Another proof due to Hille is based on replacing the differential equation (14) by the backward difference equation

$$\frac{W(t)x - W(t - h)x}{h} = GW(t)x.$$

Solve this for $W(t)$:

$$W(t) = (I - hG)^{-1} W(t - h).$$

Set $h = t/n$, and set $t = h, 2h, \ldots$; we obtain recursively

$$W(t) = \Big( I - \frac{t}{n} G \Big)^{-n}. \tag{27}$$

We denote these operators as $Z_n(t)$. We claim:

(i) Each $Z_n(t)$ is a contraction.
(ii) $Z_n(t)$ converges strongly to a semigroup whose generator is $G$.

(i) is an immediate consequence of (20). We omit the verification of (ii); see example 1 in section 34.3.
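Hille's formula (27) can likewise be tried in the scalar case. The sketch assumes, purely for illustration, that $G$ is multiplication by a real number $g \le 0$; then (27) reduces to $(1 - tg/n)^{-n}$, each value lies in $(0, 1]$, and the values approach $e^{tg}$ as $n$ grows.

```python
import math


def backward_euler(g, t, n):
    """(1 - (t/n) g)^{-n}: n steps of the backward difference scheme (27), scalar g."""
    h = t / n
    return (1.0 / (1.0 - h * g)) ** n


g = -2.0  # assumed example: scalar generator with g <= 0
t = 1.0
errors = {n: abs(backward_euler(g, t, n) - math.exp(t * g)) for n in (10, 100, 1000)}
```

The entries of `errors` decrease roughly in proportion to $1/n$, the first-order accuracy one expects from a backward difference quotient.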




34.3 THE APPROXIMATION OF SEMIGROUPS 


The result described in this section is of considerable importance in analyzing the convergence of discrete approximations to the solutions of partial differential equations. We will first state it in the language of semigroups. We start with a strongly continuous one-parameter semigroup of operators $Z(t)$, as in the previous sections, with infinitesimal generator $G$. We discretise time into integer multiples of a small unit $h$, imagined to tend to zero eventually, and define recursively discrete approximations $u^{(n)}$ to $Z(nh)x$ as follows:

$$u^{(n+1)} = C_h u^{(n)}, \qquad u^{(0)} = x, \tag{28}$$


$C_h$ a bounded operator depending on $h$. The function $u(t) = Z(t)x$ satisfies the differential equation

$$\partial_t u = Gu. \tag{29}$$

Therefore $(u^{(n+1)} - u^{(n)})/h$ ought to be an approximation to $Gu$. Computing this difference quotient from (28), we are led to the following condition for the operator $C_h$, called consistency:

$$\lim_{h \to 0} \left| \Big( \frac{C_h - I}{h} - G \Big) u(t) \right| = 0 \tag{30}$$

for $u(t) = Z(t)x$, uniformly for all $0 \le t \le 1$, for a set of $x$ that is dense in the domain $D$ of $G$.

The recursion (28) can be solved inductively to yield

$$u^{(n)} = C_h^n x \tag{31}$$


as an approximation to $Z(nh)x$. We impose now a second condition, called stability, that requires the approximations to depend boundedly on the initial vector $x$:

$$|C_h^n| \le \text{const} \quad \text{for } nh \le 1 \tag{32}$$

and for some constant independent of $n$ or $h$.
The following result is called Lax’s equivalence theorem: 


Theorem 8. Let $Z(t)$ be a strongly continuous one-parameter semigroup acting on a Banach space $X$. Let (28) be a discrete approximation to $Z(t)$, satisfying the consistency condition (30). Then $u^{(n)} = C_h^n x$ tends to $Z(t)x$ as $nh$ tends to $t$, for all $x$ in $X$, iff the scheme is stable in the sense of condition (32).


Proof. The necessity of condition (32) follows from the principle of uniform boundedness; see theorem 7 of chapter 15. To prove sufficiency, we use the following version of a high school algebra identity, valid for noncommuting operators




$$A^n - B^n = A^{n-1}(A - B) + A^{n-2}(A - B)B + \cdots + (A - B)B^{n-1}.$$

We take $A = C_h$, $B = Z(h)$, and let both sides act on a vector $x$:

$$C_h^n x - Z(nh)x = \sum_{j=0}^{n-1} C_h^{n-1-j} (C_h - Z(h)) u(jh); \tag{33}$$

here we have used the semigroup property, and the notation $u(jh) = Z(jh)x$.
By the definition of infinitesimal generator, for every $x$ in the domain of $G$,

$$Z(h)x = x + hGx + s_h,$$

where $s_h$ is a vector whose norm $|s_h|$ is $o(h)$. So, for any $t$,

$$Z(h)u(t) = Z(h)Z(t)x = Z(t)Z(h)x = Z(t)[x + hGx + s_h] = u(t) + hGu(t) + Z(t)s_h.$$

We use this expression of $Z(h)u$ to write

$$(C_h - Z(h)) u = (C_h - I - hG) u - Z(t)s_h.$$

Setting this into (33) gives

$$C_h^n x - Z(nh)x = \sum_{j=0}^{n-1} C_h^{n-1-j} [C_h - I - hG] u(jh) - \sum_{j=0}^{n-1} C_h^{n-1-j} Z(jh) s_h. \tag{33'}$$


Using the consistency condition (30), the estimate $|s_h| = o(h)$, the stability condition
(32), and the boundedness of $\|Z(t)\|$, we conclude that for $t \le 1$ the right side of
(33') is bounded in norm by $\sum_0^{n-1} o(h) = n\,o(h)$, which for $nh \le 1$ tends to zero as
$h$ tends to zero. This proves that $C_h^n x$ tends to $Z(t)x$, as $nh$ tends to $t$ and $h$ tends to
zero, for all $x$ for which (30) holds, and for $t \le 1$. The set of such $x$ is dense in $X$;
since the operators $C_h^n$ and $Z(t)$ are uniformly bounded, the conclusion holds for all
$x$. The extension to $t > 1$ is obvious. □
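
The algebraic identity at the start of the proof is easy to check numerically. The following is a minimal sketch, assuming NumPy and using random 4 x 4 matrices as stand-ins for the operators A and B; the helper name telescoping_sum is ours, not the text's. It confirms the identity for a pair of matrices that do not commute.

```python
import numpy as np

def telescoping_sum(A, B, n):
    """Right side of A^n - B^n = sum_{j=0}^{n-1} A^(n-1-j) (A - B) B^j."""
    total = np.zeros_like(A)
    for j in range(n):
        left = np.linalg.matrix_power(A, n - 1 - j)
        right = np.linalg.matrix_power(B, j)
        total = total + left @ (A - B) @ right
    return total

# Hypothetical test matrices; any square matrices of equal size would do.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) / 4
B = rng.standard_normal((4, 4)) / 4
n = 6

lhs = np.linalg.matrix_power(A, n) - np.linalg.matrix_power(B, n)
rhs = telescoping_sum(A, B, n)

identity_error = np.max(np.abs(lhs - rhs))        # rounding error only
commutator_norm = np.linalg.norm(A @ B - B @ A)   # nonzero: A and B do not commute
print(identity_error, commutator_norm)
```

The order of the factors matters: the difference A - B always sits between the powers of A on the left and the powers of B on the right, which is what allows stability to be applied to the left factor and the semigroup property to the right one.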


The equivalence theorem is the framework of finite difference and other discrete 
approximations for solving time-dependent partial differential equations; see Lax 
and Richtmyer. There $C_h$ is a difference operator in the space variables, G a partial
differential operator; consistency is verified, in the class of smooth functions, by 
using Taylor’s theorem. 

The equivalence theorem is valid beyond the setting of semigroups; it holds equally
for approximations of form (28) to solutions of differential equations $u_t = Gu$,
where G is a linear operator that depends on t.

The literature of this field is enormous (e.g., see Richtmyer and Morton). 

Here are some applications of theorem 8 in an abstract framework. 




Example 1. Suppose that $Z(t)$ is a semigroup of contractions, and let us choose for
$C_h$ the backward difference operator (27) employed by Hille in the proof sketched
at the end of the previous section:

$$C_h = (I - hG)^{-1}. \tag{34}$$


We claim that this choice of $C_h$ leads to a consistent and stable difference scheme.
Consistency follows from this string of algebraic identities:

$$\frac{C_h - I}{h} - G = (I - hG)^{-1}\,\frac{I - (I - hG)}{h} - G = (I - hG)^{-1}G - G$$

$$= (I - hG)^{-1}[G - (I - hG)G] = h(I - hG)^{-1}G^2 = (h^{-1}I - G)^{-1}G^2.$$


Clearly, if $x$ belongs to the domain of $G^2$, then it follows from estimate (20) that the
consistency condition (30) is satisfied. Since we have shown in theorem 4 that the
domain of $G^2$ is dense, consistency follows.
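
As a finite-dimensional illustration of Example 1, the sketch below applies the backward difference scheme (34) to a symmetric matrix G with nonpositive eigenvalues, so that Z(t) = exp(tG) is a semigroup of contractions. The matrix (a small second-difference matrix), the initial vector, and all function names are assumptions made for the illustration; NumPy is assumed. It tabulates the error of the scheme at t = 1 and the norm of $C_h$.

```python
import numpy as np

m = 8
# Hypothetical generator: symmetric second-difference matrix, eigenvalues in (-4, 0).
G = -2.0 * np.eye(m) + np.eye(m, k=1) + np.eye(m, k=-1)
lam, Q = np.linalg.eigh(G)

def Z(t):
    """Z(t) = exp(tG), computed from the spectral decomposition of the symmetric G."""
    return (Q * np.exp(t * lam)) @ Q.T

def backward_euler(h, n, x):
    """Apply C_h = (I - hG)^{-1} to x a total of n times, as in (31) and (34)."""
    C_h = np.linalg.inv(np.eye(m) - h * G)
    u = x.copy()
    for _ in range(n):
        u = C_h @ u
    return u

x = np.ones(m)
t = 1.0
steps = [10, 20, 40, 80]
errors = [np.linalg.norm(backward_euler(t / n, n, x) - Z(t) @ x) for n in steps]
norms = [np.linalg.norm(np.linalg.inv(np.eye(m) - (t / n) * G), 2) for n in steps]

for n, err, nrm in zip(steps, errors, norms):
    print(f"n = {n:3d}   error = {err:.3e}   ||C_h|| = {nrm:.6f}")
```

In this example the norm of $C_h$ never exceeds 1, which gives stability, and the error shrinks roughly in proportion to h, as one would expect from a consistency error of size $h\,\|G^2 x\|$.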


Exercise 7. Prove, using (20), that the scheme (34) is stable. 


Theorem 8 is only one of several approximation theorems for semigroups. Other 
versions are due to Trotter, Kato, and Chernoff (see sections 7 and 8 of chapter 1 
of Goldstein, and Strang). A useful consequence of Trotter’s theorem is Trotter’s 
product formula. 

Suppose that G and H are generators of strongly continuous semigroups T and S 
of contractions, and that the closure of G + H also generates such a semigroup Z. 
Then as $nh \to t$, $h \to 0$,

$$\lim\,[T(h)S(h)]^n x = Z(t)x \tag{35}$$

for all $x$ in $X$.

The significance of this result is that many infinitesimal generators of physical
processes have a natural decomposition as a sum. Also in many problems it is
advantageous to compute approximately $T(h)$ and $S(h)$ by entirely different methods.
Strang has pointed out that

$$\Bigl[T\bigl(\tfrac{h}{2}\bigr)S(h)T\bigl(\tfrac{h}{2}\bigr)\Bigr]^n x = T\bigl(\tfrac{h}{2}\bigr)S(h)T(h)S(h)\cdots T(h)S(h)T\bigl(\tfrac{h}{2}\bigr)x$$

is a much better approximation to $Z(nh)x$ than $[T(h)S(h)]^n x$.
Exercise 8. Why? 
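
A numerical experiment can suggest what the exercise is after. The sketch below is an illustration under stated assumptions, not a proof: G and H are hypothetical symmetric negative definite 3 x 3 matrices that do not commute, exponentials are computed from the spectral decomposition, and NumPy is assumed. It compares the error of the product formula (35) with that of Strang's symmetric product at t = 1.

```python
import numpy as np

def expm_sym(M, t):
    """exp(tM) for a real symmetric matrix M, via its eigendecomposition."""
    lam, Q = np.linalg.eigh(M)
    return (Q * np.exp(t * lam)) @ Q.T

# Hypothetical generators: symmetric, negative definite, noncommuting.
G = np.array([[-1.0, 0.5, 0.0], [0.5, -2.0, 0.0], [0.0, 0.0, -0.5]])
H = np.array([[-0.5, 0.0, 0.3], [0.0, -1.0, 0.4], [0.3, 0.4, -1.5]])
x = np.array([1.0, -1.0, 2.0])
t = 1.0
exact = expm_sym(G + H, t) @ x

def product_error(n, symmetric):
    """Error at time t of n steps of the plain or the symmetric (Strang) product."""
    h = t / n
    if symmetric:
        step = expm_sym(G, h / 2) @ expm_sym(H, h) @ expm_sym(G, h / 2)
    else:
        step = expm_sym(G, h) @ expm_sym(H, h)
    return np.linalg.norm(np.linalg.matrix_power(step, n) @ x - exact)

steps = [20, 40]
lie_errors = [product_error(n, symmetric=False) for n in steps]
strang_errors = [product_error(n, symmetric=True) for n in steps]
print("T(h)S(h):          ", lie_errors)
print("T(h/2)S(h)T(h/2):  ", strang_errors)
```

In this experiment halving h roughly halves the error of $[T(h)S(h)]^n x$ but divides the error of the symmetric product by about four, consistent with first-order and second-order accuracy respectively.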
It is natural to ask if it is possible in theorem 8 to dispense with the hypothesis
that G is the generator of a strongly continuous semigroup, and instead deduce this
from the existence of a stable difference scheme (28) that is consistent with G. In the
context of partial differential equations, this would amount to proving the existence
of solutions by means of difference schemes. We offer only a weak abstract result; for
simplicity we take X to be a Hilbert space.
Let G be a densely defined closed operator whose adjoint $G^*$ also is densely defined.

A weak solution of the equation

$$\frac{\partial u}{\partial t} = Gu, \qquad u(0) = x \tag{36}$$


is defined as follows:
Let $w(t)$ be any continuously differentiable vector-valued function of $t$ whose
values lie in the domain of $G^*$, and such that $G^*w(t)$ is a continuous function of $t$. Form the
scalar product of $w(t)$ with (36) and integrate with respect to $t$ from 0 to 1:

$$\int_0^1 \Bigl(w,\ \frac{\partial u}{\partial t} - Gu\Bigr)\,dt = 0.$$


Integrate by parts, and use the adjointness of $G$ and $G^*$. Restrict $w$ to vanish for
$t \ge 1$; we get

$$\int_0^1 \Bigl(\frac{\partial w}{\partial t} + G^*w,\ u\Bigr)\,dt + (w(0), x) = 0. \tag{37}$$


A function $w(t)$ satisfying all conditions specified above is called an admissible test
function.


Definition. An integrable vector-valued function u(t) that satisfies (37) for all ad- 
missible test functions w is called a weak solution of (36). 


We describe now briefly how approximations of form (28) can be used to construct
weak solutions. We rewrite (28) as

$$\frac{u^{(n+1)} - u^{(n)}}{h} = G_h u^{(n)}, \qquad u^{(0)} = x, \tag{38}$$

where

$$G_h = \frac{C_h - I}{h}.$$

We replace the consistency condition (30) by its dual:

$$G_h^* w \to G^* w \tag{39}$$

for all $w$ in the domain of $G^*$.

The solution of (38) is given by formula (31): $u^{(n)} = C_h^n x$. We retain the stability
condition (32).




We introduce the Hilbert space H as the completion in the H-norm of continuous
vector-valued functions $w(t)$; the H-norm is defined as

$$\|w\|_H^2 = \int_0^1 \|w(t)\|^2\,dt.$$


Given a solution of the difference equation (38), we extend it as a function of $t$ by
setting

$$u_h(t) = u^{(n)} \quad \text{for } nh \le t < (n+1)h.$$

Clearly, the H-norm of $u_h$ is

$$\|u_h\|_H^2 = h \sum_{n=0}^{N-1} \|u^{(n)}\|^2, \qquad Nh = 1.$$

The stability condition implies that $\|u^{(n)}\| \le$ const. for all $n$, $nh \le 1$; it follows
that $\|u_h\|_H \le$ const. for all $h$. We appeal now to the weak sequential compactness
of bounded sets in Hilbert space (see chapter 10) and conclude that we can select
a subsequence of $h \to 0$ so that $\{u_h\}$ converges weakly to some limit $u$ in H. We
claim

Theorem 9. The weak limit $u$ is a weak solution of (36).


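Proof. To show this, take any admissible test function $w(t)$, and take the scalar
product of $w^{(n)} = w(nh)$ with (38). Multiply by $h$ and sum with respect to $n$:

$$h \sum_n \Bigl[\Bigl(w^{(n)},\ \frac{u^{(n+1)} - u^{(n)}}{h}\Bigr) - \bigl(w^{(n)},\ G_h u^{(n)}\bigr)\Bigr] = 0.$$

Sum by parts, and use the adjointness of $G_h$ and $G_h^*$. Since $w(t) = 0$ for $t \ge 1$, we
get

$$h \sum_{n=1}^{N} \Bigl(\frac{w^{(n)} - w^{(n-1)}}{h} + G_h^* w^{(n)},\ u^{(n)}\Bigr) + (w(0), x) = 0. \tag{40}$$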


Now replace in (40) the function $w(t)$ by $w(t - s)$, and integrate over the interval
$0 \le s \le h$. The result can be written as

$$\int_0^1 \Bigl(\frac{w(t) - w(t-h)}{h} + G_h^* w(t),\ u_h(t)\Bigr)\,dt + (w(0), x). \tag{40'}$$


Now let $h$ tend to zero along the previously selected subsequence. Since admissible
test functions are differentiable, and $G_h^* w(t)$ tends to $G^* w(t)$ uniformly for all $t \le 1$,
(40') tends to

$$\int_0^1 \Bigl(\frac{\partial w}{\partial t} + G^*w,\ u\Bigr)\,dt + (w(0), x) = 0.$$

This proves that $u$ is a weak solution of (36). □


REMARK 1. The preceding argument shows that the weak limit lies in H; that is, it
is square integrable. It is easy to show that it is in fact bounded. Take any interval
$[a, b]$ in $[0, 1]$; according to the stability condition, $\|u^{(n)}\|^2 \le$ const. for all $n$. It
follows that

$$\int_a^b \|u_h(t)\|^2\,dt = h \sum_{a < nh < b} \|u^{(n)}\|^2 \le (b - a)\,\text{const.}$$

The $L^2$-norm is lower semicontinuous under weak convergence, so

$$\int_a^b \|u(t)\|^2\,dt \le \liminf \int_a^b \|u_h(t)\|^2\,dt \le (b - a)\,\text{const.}$$

This proves that $\|u(t)\| \le$ const. for almost all $t$.


REMARK 2. The consistency condition (39) can be relaxed by requiring (39) hold 
only on some dense linear subspace W of the domain of G*. This requires a corre- 
sponding change in what test functions are admissible. 


REMARK 3. A famous theorem of Friedrichs shows that for first-order partial differ- 
ential operators, a weak solution u is a strong solution, that is, an L? limit of genuine 
solutions. I don’t know of any comparable abstract result. 


34.4 PERTURBATION OF SEMIGROUPS 


Rellich has shown (see theorem 5 in chapter 33) that if one adds to a self-adjoint 
operator A another symmetric operator not too large compared to A, then the sum 
also is self-adjoint. A similar result holds for generators of contraction semigroups. 
Before we state it, we explain a reformulation due to Lumer and Phillips of the
Hille-Yosida condition on the generators of contraction semigroups. For simplicity we state
it only for semigroups acting on a Hilbert space.


Lemma 10 (Lumer-Phillips). Let G denote a densely defined operator on a Hilbert
space H whose resolvent set includes $\mathbb{R}_+$. Then inequality (20), necessary and
sufficient for G to generate a semigroup of contractions, is equivalent to

$$\operatorname{Re}\,(x, Gx) \le 0 \tag{20'}$$


for all x in the domain of G. 




An operator satisfying (20’) is called dissipative. 


Proof. Condition (20) asserts that for all $u$ in H and all $\lambda > 0$,

$$\|(\lambda I - G)^{-1}u\|^2 \le \frac{1}{\lambda^2}\,\|u\|^2.$$

Denote $(\lambda I - G)^{-1}u$ as $x$. Then the inequality above can be rewritten as

$$(x, x) \le \frac{1}{\lambda^2}\,(\lambda x - Gx,\ \lambda x - Gx).$$

Expanding the right side, cancelling $(x, x)$ on both sides, multiplying by $\lambda$, and
rearranging terms, we get

$$(x, Gx) + (Gx, x) \le \frac{1}{\lambda}\,\|Gx\|^2.$$

Letting $\lambda$ tend to $\infty$ we obtain (20').
The converse can be shown by running the proof backward. 
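
For matrices the equivalence in lemma 10 can be observed directly. The sketch below is a finite-dimensional illustration under our own assumptions (NumPy, hypothetical matrices, helper names of our choosing). It builds a dissipative matrix as the sum of a negative semidefinite and a skew-symmetric part and checks the resolvent bound $\|(\lambda I - G)^{-1}\| \le 1/\lambda$, then shows that a nilpotent matrix that is not dissipative violates the bound even though all positive $\lambda$ lie in its resolvent set.

```python
import numpy as np

def max_real_part(G):
    """Largest value of Re (x, Gx) over unit vectors x: top eigenvalue of (G + G*)/2."""
    return np.linalg.eigvalsh((G + G.conj().T) / 2).max()

def resolvent_norm(G, lam):
    """Operator norm of (lam I - G)^{-1}."""
    n = G.shape[0]
    return np.linalg.norm(np.linalg.inv(lam * np.eye(n) - G), 2)

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
K = rng.standard_normal((5, 5))
G_diss = -(B @ B.T) + (K - K.T)              # Re (x, G x) = -|B^T x|^2 <= 0
G_bad = np.array([[0.0, 4.0], [0.0, 0.0]])   # spectrum {0}, but not dissipative

lams = [0.1, 1.0, 10.0]
diss_ok = all(resolvent_norm(G_diss, lam) <= (1 / lam) * (1 + 1e-9) for lam in lams)
bad_violates = any(resolvent_norm(G_bad, lam) > (1 / lam) * (1 + 1e-9) for lam in lams)
print(max_real_part(G_diss), diss_ok)
print(max_real_part(G_bad), bad_violates)
```

The second example shows that the location of the spectrum alone does not decide the matter; it is the sign of Re (x, Gx) that controls the norm of the resolvent.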


The following result is due to Trotter: 


Theorem 11. Suppose that G is the generator of a semigroup of contractions on a 
Hilbert space H. Let H be an operator with the following properties: 


(i) The domain of H includes the domain of G. 
(ii) H is dissipative. 
(iii) There exist numbers $a$ and $b$, $a < 1$, such that

$$\|Hx\| \le a\|Gx\| + b\|x\| \tag{41}$$

for all $x$ in the domain of G.

Then G + H, defined on the domain of G, generates a semigroup of contractions.

Proof. Since G is the generator of a strongly continuous semigroup, it is a closed
operator. We claim that so is G + H. To see this, let $x_n$ be a convergent sequence
such that $(G + H)x_n = y_n$ also converges. We write $Gx_n = y_n - Hx_n$ and form the
difference

$$G(x_n - x_m) = y_n - y_m - H(x_n - x_m).$$

Using inequality (41) on the right, we conclude that $Gx_n$ converges, and consequently
also $Hx_n$ converges. Since G is closed, $Gx_n \to Gx$ where $x_n \to x$. So
$x$ belongs to the domain of G. That $Hx_n$ converges to $Hx$ follows from (41).




We claim that every sufficiently large positive $\lambda$ belongs to the resolvent set of G + H.
First we show that for every $x$ in the domain of G,

$$\|x\| \le \frac{1}{\lambda}\,\|(\lambda I - G - H)x\|. \tag{42}$$

According to the Lumer-Phillips lemma, G, being the generator of a semigroup of
contractions, is dissipative. H is dissipative by hypothesis; therefore so is their sum.
Inequality (42) follows from the converse part of the Lumer-Phillips lemma.

It follows from (42) that the range of $G + H - \lambda I$ is closed. To show that it is
the whole space we argue indirectly; if it were not, there would be a nonzero vector $v$
perpendicular to the range: for all $x$ in D(G),

$$\bigl((G + H - \lambda I)x,\ v\bigr) = 0. \tag{43}$$

Since G generates a semigroup of contractions, $G - \lambda I$ is invertible; so there is an $x$
in the domain of G such that

$$(G - \lambda I)x = v. \tag{44}$$

Setting this into (43), we get

$$\|v\|^2 + (Hx, v) = 0.$$

Using the Schwarz inequality to estimate the second term, we get

$$\|v\| \le \|Hx\|. \tag{45}$$

Using (44) to express $v$, and inequality (41) to estimate the right side of (45), we get

$$\|Gx - \lambda x\| \le a\|Gx\| + b\|x\|.$$

Square both sides; using the fact that G is dissipative, we get

$$\|Gx\|^2 + \lambda^2\|x\|^2 \le a^2\|Gx\|^2 + 2ab\,\|Gx\|\,\|x\| + b^2\|x\|^2. \tag{46}$$

Since $a < 1$, for $\lambda$ large enough we conclude that $\|x\| = 0$. That makes $x = 0$ and
$v = (G - \lambda I)x = 0$; this proves that the range of $G + H - \lambda I$ is the whole space.
This, combined with the Lumer-Phillips lemma, shows that G + H generates a
semigroup of contractions. □


A statement and proof of Trotter’s perturbation theorem in a Banach space setting 
is given in Goldstein. 


34.5 THE SPECTRAL THEORY OF SEMIGROUPS 


We saw at the beginning of this chapter that when the generator G is a bounded
operator, the semigroup is the exponential of G:

$$Z(t) = e^{tG}; \tag{47}$$

by the spectral mapping theorem, theorem 4 of chapter 18,

$$\sigma(Z(t)) = e^{t\sigma(G)}. \tag{48}$$

When G is unbounded, (47) holds only in a symbolic sense. The question is: does
(48) hold?

It is easy to show that if $\gamma$ is an eigenvalue of G, then $e^{\gamma t}$ is an eigenvalue of
$Z(t)$. To see this, let $u$ be a corresponding eigenvector: $Gu = \gamma u$. Then

$$\frac{d}{dt}\,e^{-\gamma t}Z(t)u = e^{-\gamma t}Z(t)(G - \gamma I)u = 0,$$

which means that $e^{-\gamma t}Z(t)u$ is independent of $t$. Since its value at $t = 0$ is $u$, it
follows that $e^{-\gamma t}Z(t)u = u$ for all $t$. This shows that $e^{\gamma t}$ is an eigenvalue of $Z(t)$.
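
In finite dimensions, where every point of the spectrum is an eigenvalue, the statement above can be checked directly. The following is a minimal sketch, assuming NumPy; the 3 x 3 matrix G is hypothetical, and the helper expm is a plain scaling-and-squaring Taylor approximation written for this illustration, not a library routine. It verifies that each eigenvector $u$ of G with eigenvalue $\gamma$ satisfies $Z(t)u = e^{\gamma t}u$.

```python
import numpy as np

def expm(M, squarings=8, terms=20):
    """Matrix exponential by scaling and squaring with a truncated Taylor series."""
    scaled = M / 2.0**squarings
    result = np.eye(M.shape[0], dtype=complex)
    term = np.eye(M.shape[0], dtype=complex)
    for k in range(1, terms + 1):
        term = term @ scaled / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result

# Hypothetical bounded generator with one real and two complex eigenvalues.
G = np.array([[0.0, 1.0, 0.0],
              [-2.0, -0.3, 0.0],
              [0.0, 0.0, -1.0]])
t = 0.8
Z_t = expm(t * G)

gammas, vecs = np.linalg.eig(G)
residuals = [
    np.linalg.norm(Z_t @ vecs[:, j] - np.exp(t * gammas[j]) * vecs[:, j])
    for j in range(3)
]
print(residuals)   # each residual is at rounding-error level
```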

Ralph Phillips has shown that the same conclusion holds for any $\gamma$ in the spectrum
of G:


Theorem 12. Let $Z(t)$ be a strongly continuous one-parameter semigroup of operators,
G its generator. Then

$$\sigma(Z(t)) \supset e^{t\sigma(G)}. \tag{48'}$$


Proof. The operators $Z(t)$ and $R(\zeta)$ commute for any $\zeta$ in the resolvent set of G.
Adjoin to this collection of operators their resolvents, and denote by A the closure
in the uniform topology of the algebra generated by these operators. A is a Banach
algebra, and the spectrum of Z and R as operators mapping $X \to X$ is the same as
their spectrum as elements of the algebra A.

We define the one-parameter family of operators $V(t)$ by

$$V(t) = R(\zeta)Z(t), \qquad \zeta > k; \tag{49}$$

$\zeta$ lies in the resolvent set of G. We claim that $V(t)$ depends continuously on $t$ in the
uniform topology. To see this, substitute the formula (16) for R into (49):

$$V(t)x = \int_0^\infty Z(s)e^{-\zeta s}Z(t)x\,ds = \int_0^\infty Z(s + t)x\,e^{-\zeta s}\,ds = e^{\zeta t}\int_t^\infty Z(r)x\,e^{-\zeta r}\,dr;$$

continuity in the uniform topology follows. Since Z is a one-parameter semigroup
that commutes with R, it follows from (49) that

$$R\,V(t + s) = V(t)V(s). \tag{50}$$




By (9), $\gamma$ belongs to the spectrum of G iff $(\zeta - \gamma)^{-1}$ belongs to the spectrum of
$R(\zeta)$:

$$\sigma(R(\zeta)) = (\zeta - \sigma(G))^{-1}. \tag{51}$$

According to Gelfand's theory (see theorem 14 of chapter 18) the spectrum of R is
$\sigma(R) = \{p(R)\}$ as $p$ ranges over all homomorphisms $p$ of A into $\mathbb{C}$. Combining this
with (51), we conclude that for every $\gamma$ in $\sigma(G)$ there is a $p: A \to \mathbb{C}$ such that

$$p(R(\zeta)) = (\zeta - \gamma)^{-1}. \tag{51'}$$

Let $p$ act on (50); since $p$ is multiplicative, we get

$$p(R)\,p(V(t + s)) = p(V(t))\,p(V(s)). \tag{52}$$

It follows from (51') that $p(R) \ne 0$. We define

$$m(t) = p(V(t))/p(R) \tag{53}$$

and rewrite (52) as

$$m(t + s) = m(t)\,m(s). \tag{54}$$
We have shown above that $V(t)$ is a continuous function of $t$ in the uniform topology;
the homomorphisms are continuous in the uniform topology. Combining these,
we conclude that $p(V(t))$, and therefore $m(t)$, is a continuous function from $\mathbb{R}_+$ to $\mathbb{C}$.
As is well known, all continuous solutions of (54) are of the form

$$m(t) = e^{\kappa t}. \tag{55}$$

Apply $p$ to (49) to get $p(V(t)) = p(R)\,p(Z(t))$; combining this with (53) and (55)
gives

$$p(Z(t)) = e^{\kappa t}. \tag{55'}$$
Now multiply (16) by R:

$$R^2 x = \int_0^\infty e^{-\zeta s}\,RZ(s)x\,ds.$$

As shown above, $RZ(s)$ is continuous in the uniform topology; therefore the integral
above exists in the uniform topology:

$$R^2 = \int_0^\infty e^{-\zeta s}\,RZ(s)\,ds.$$

Apply $p$ to both sides and use (55'):

$$p(R)^2 = \int_0^\infty e^{-\zeta s}\,p(R)\,p(Z(s))\,ds = p(R)\int_0^\infty e^{(\kappa - \zeta)s}\,ds = \frac{p(R)}{\zeta - \kappa}.$$

Dividing by $p(R)$, we get $p(R) = (\zeta - \kappa)^{-1}$. Comparing this with (51'), we conclude
that $\kappa = \gamma$. Setting this into (55') gives

$$p(Z(t)) = e^{\gamma t}. \tag{56}$$

According to Gelfand's theory, theorem 14 of chapter 19, $p(Z(t))$ is in the spectrum
of $Z(t)$. Since $\gamma$ is any point in the spectrum of G, we conclude from (56) that
$\sigma(Z(t))$ contains $e^{t\sigma(G)}$, as asserted in (48'). □


The inclusion in (48') is in some cases proper. Phillips has, however, shown


Theorem 12'. Let $Z(t)$ denote a strongly continuous one-parameter semigroup of
operators, G its infinitesimal generator. Suppose that for some $T > 0$, $Z(t)$ is
uniformly continuous for $t > T$. Then every nonzero point in the spectrum of $Z(t)$ is of
the form $e^{\gamma t}$, $\gamma$ in $\sigma(G)$.


Proof. The spectrum of $Z(t)$ is the range of the homomorphisms $p: A \to \mathbb{C}$.
Applying $p$ to $Z(s + t) = Z(s)Z(t)$ and abbreviating $p(Z(t))$ as $n(t)$, we get

$$n(s + t) = n(s)\,n(t).$$

Since $Z(t)$ is assumed uniformly continuous for $t > T$, it follows from the definition
that $n(t)$ is continuous for $t > T$. The only such solutions of this functional equation
$\not\equiv 0$ are the exponentials: $n(t) = e^{\kappa t}$. The rest of the proof proceeds as that of
theorem 12.


Theorem 13. Let $Z(t)$ be a strongly continuous one-parameter semigroup of operators,
G its infinitesimal generator. Suppose that for some $T > 0$, $Z(T)$ is compact.
Then

(i) the nonzero part of the spectrum of $Z(t)$ is of the form $\sigma(Z(t)) = e^{t\sigma(G)}$,
(ii) the spectrum of G consists of discrete points $\{\gamma_j\}$, $\operatorname{Re}\gamma_1 \ge \operatorname{Re}\gamma_2 \ge \ldots$, $\operatorname{Re}\gamma_j \to -\infty$,
(iii) for every vector $x$, $Z(t)x$ has an asymptotic expansion for large $t$ of the form

$$Z(t)x \sim \sum_j e^{\gamma_j t}\,p_j(t), \tag{57}$$

where the $p_j$ are polynomials in $t$ whose coefficients are generalized eigenvectors
of G.


Exercise 9. Prove theorem 13. 




We turn now to the question of the transpose of a semigroup and of its generator. 


Theorem 14. Let X be a reflexive Banach space, $Z(t): X \to X$ a strongly continuous
one-parameter semigroup of operators. Then its transpose $Z'(t): X' \to X'$ is
likewise a strongly continuous one-parameter semigroup of operators, generated by
the transpose of the generator of $Z(t)$.


Proof. By definition of the transpose,

$$(Z(t)x, \ell) = (x, Z'(t)\ell); \tag{58}$$

so $Z'(t)$ is weakly sequentially continuous. But then, by theorem 5, $Z'(t)$ is strongly
continuous.

Denote the generator of $Z'(t)$ by H, that of $Z(t)$ by G. Choose $x$ to be in D(G),
$\ell$ in D(H). Differentiate (58) with respect to $t$ and set $t = 0$:

$$(Gx, \ell) = (x, H\ell).$$

Comparing this with the definition of $G'$, we conclude that $G'$ is an extension of H.
By theorem 4, all $\lambda > k$ belong to the resolvent set of H; the resolvent set of $G'$, by
theorem 3 the same as that of G, also contains these points. But then $G'$ cannot be a
proper extension of H, and thus $G'$ is the generator of $Z'(t)$. □


BIBLIOGRAPHY 


Chernoff, P. R. Note on product formulas for operator semigroups. J. Func. Anal., 2 (1968): 238-242. 


Friedrichs, K. O. The identity of weak and strong extensions of differential operators. Trans. AMS, 55
(1944): 132-151.


Goldstein, J. A. Semigroups of Linear Operators and Applications. Oxford University Press, Oxford, 
1985. 


Hille, E. and Phillips, R. S. Functional Analysis and Semigroups. AMS Coll. Publ., 31. American Mathe- 
matical Society, New York, 1957. 


Kato, T. On the Trotter-Lie product formula. Proc. Japan Acad., 50 (1974): 694-698. 


Lax, P. D. and Richtmyer, R. D. Survey of the stability of linear finite difference equations. CPAM, 9 
(1956): 267-293. 


Lumer, G. and Phillips, R. S. Dissipative operators in a Banach space. Pac. J. Math., 11 (1961): 679-698. 
Phillips, R. S. Spectral theory for semigroups of linear operators. Trans. AMS, 74 (1951): 393-415. 


Richtmyer, R. D. and Morton, K. W. Difference Methods for Initial Value Problems, 2nd ed. Interscience, 
New York, 1967. 


Strang, G. Approximation of semigroups and the consistency of difference schemes. Proc. AMS, 20 
(1969): 1-7. 




Trotter, H. F. On the product of semigroups of operators. Proc. AMS, 10 (1959): 545-551. 
Trotter, H. F. Approximation of semi-groups of operators. Pac. J. Math., 8 (1958): 887-919.


Yosida, K. On the differentiability and the representation of one-parameter semigroups of linear operators. 
J. Math. Soc. Jap., 1 (1948): 15-21. 


Yosida, K. Functional Analysis. Springer Verlag, 1965. 


35


GROUPS OF UNITARY OPERATORS


The mathematical landscape is full of groups of unitary operators. The ones we will 
consider in this chapter, strongly continuous one-parameter groups $U(t)$, $-\infty <
t < \infty$, come mostly from three sources: processes where energy is conserved, such
as those governed by wave equations of all sorts; processes where probability is 
preserved, for instance, ones governed by Schrödinger equations; and Hamiltonian 
and other measure-preserving flows. 


35.1 STONE’S THEOREM 
Theorem 1. Let A be a self-adjoint operator acting on a Hilbert space H.


(i) There exists a strongly continuous group U(t) of unitary operators whose 
infinitesimal generator is iA. 

(ii) Conversely, every strongly continuous group of unitary operators is generated 
by iA, A some self-adjoint operator. 


Proof. (i) We saw in theorem 5 of chapter 32 that every nonreal complex number
$z$ belongs to the resolvent set of any self-adjoint operator A, and [see (13) in chapter
32] that the resolvent is bounded by

$$\|R(z)\| \le \frac{1}{|\operatorname{Im} z|}.$$

It follows from this that both $iA$ and $-iA$ satisfy the hypothesis of the Hille-Yosida
theorem, theorem 7 in chapter 34, so both $iA$ and $-iA$ generate strongly continuous
semigroups of contractions; denote these by $U(t)$ and $V(t)$. We claim that $V(t)$ and
$U(t)$ are inverses. To see this, take any $x$ in the domain of A and form

$$U(t)V(t)x.$$




This is a differentiable function of $t$, and its derivative is zero:

$$U(t)\,iA\,V(t)x - U(t)\,iA\,V(t)x = 0.$$

This proves that $U(t)V(t)x$ is independent of $t$. Since it is $x$ at $t = 0$, it is $x$ for all $t$.
In other words, $U(t)V(t)$ is the identity on the domain of A. Since the domain of A is
dense in the Hilbert space H on which A acts, $U(t)V(t)$ is, by continuity, the identity
on all of H. Reversing U and V shows that they are indeed inverses of each other.

According to the Hille-Yosida theorem, both $U(t)$ and $V(t)$ are contractions. On
the other hand, their product is I; this can only be if both are norm preserving. Since
they are invertible, they are unitary; see section 31.7 for the basics of unitary operators.

Define $U(t)$ for $t$ negative as $V(-t)$. Clearly, $U(s + t) = U(s)U(t)$ is satisfied for
all real $t$ and $s$, and $dU(t)x/dt = iAU(t)x$ for all $x$ in the domain of A.

(ii) We turn now to proving the converse proposition. Let $U(t)$, $-\infty < t < \infty$,
be a strongly continuous group of unitary operators. Then $U(t)$ and $V(t) = U(-t)$
are strongly continuous semigroups of contractions; their generators are negatives of
each other, G and $-G$. It follows from theorem 7 in chapter 34 applied to G and $-G$
that all nonzero real numbers belong to the resolvent set of G. Since $U(t)$ is unitary,

$$\|U(t)x\|^2 = (U(t)x, U(t)x) = \|x\|^2.$$

Choose $x$ in the domain of G; differentiating and setting $t = 0$, we get

$$(Gx, x) + (x, Gx) = 0.$$

Replacing $x$ by $x + y$, we deduce that the real part of the left side of

$$(Gx, y) + (x, Gy) = 0 \tag{1}$$

is zero. Replacing $y$ by $iy$ shows that (1) holds for all $x$ and $y$ in the domain of G.
Equation (1) says that G is antisymmetric. It follows that $G^*$ is an extension of $-G$.
According to the Hilbert space version of theorem 3 in chapter 34 (see exercise 5),
the resolvent set of $G^*$ is the complex conjugate of the resolvent set of G. We have
shown above that all nonzero real numbers belong to the resolvent set of G; therefore
they belong to the resolvent set of $G^*$. Since they also belong to the resolvent
set of $-G$, the one cannot be a proper extension of the other. Therefore $G^* = -G$, and
so $A = -iG$ is self-adjoint. □
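
In finite dimensions part (i) of Stone's theorem reduces to the exponential $U(t) = e^{itA}$ of a Hermitian matrix. The sketch below is an illustration only, assuming NumPy and a randomly generated Hermitian matrix A of our choosing. It checks numerically that $U(t)$ is unitary, that $U(s + t) = U(s)U(t)$ holds for negative as well as positive arguments, and that the difference quotient at $t = 0$ is close to $iA$.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (M + M.conj().T) / 2                  # hypothetical Hermitian (self-adjoint) matrix
lam, Q = np.linalg.eigh(A)                # A = Q diag(lam) Q*, lam real

def U(t):
    """U(t) = exp(itA), built from the spectral decomposition of A."""
    return (Q * np.exp(1j * t * lam)) @ Q.conj().T

I = np.eye(4)
unitary_defect = np.linalg.norm(U(0.7).conj().T @ U(0.7) - I)
group_defect = np.linalg.norm(U(0.3 - 1.1) - U(0.3) @ U(-1.1))

h = 1e-6
generator_defect = np.linalg.norm((U(h) - I) / h - 1j * A)   # O(h) discretization error
print(unitary_defect, group_defect, generator_defect)
```

The spectral decomposition used here to build $U(t)$ is exactly what the remainder of this section recovers, in the opposite direction, from the group $U(t)$.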


The spectral resolution of a self-adjoint transformation A can be thought of as
defining the functions $c_S(A)$, where $c_S$ is the characteristic function of the Borel
set S. The three constructions we gave in chapter 33 for the spectral resolution of
A all started with a more limited functional calculus. Section 33.1 was based on
the resolvent $(A - zI)^{-1}$, $z$ nonreal; in section 33.2 we used the Cayley transform
$(A - iI)(A + iI)^{-1}$; section 33.3 was based on all $f(A)$, $f$ continuous on $\mathbb{R} \cup \infty$.
Stone's theorem can be interpreted as defining the exponential functions $e^{itA}$; we
sketch how to build out of this functional calculus the spectral resolution of A.




Lemma 2. Let $U(t)$ be a strongly continuous one-parameter group of unitary operators
acting on a Hilbert space H. Let $u$ be any vector in H; then the function

$$a(t) = (U(t)u, u) \tag{2}$$

is positive definite in the sense of Bochner; see chapter 14, section 14.4. That is, $a(t)$
is skew symmetric, continuous, and

$$\sum_{j,k} a(t_j - t_k)\,\phi_j \bar{\phi}_k \ge 0$$

for all choices of $t_1, \ldots, t_N$ on $\mathbb{R}$, and all complex numbers $\phi_1, \ldots, \phi_N$.


Proof. By the definition (2), the group property of $U(t)$, and the fact that
$U(-t) = U(t)^{-1} = U^*(t)$, we have

$$\sum a(t_j - t_k)\,\phi_j \bar{\phi}_k = \sum (U(t_j - t_k)u, u)\,\phi_j \bar{\phi}_k = \sum (U(t_k)^* U(t_j)u, u)\,\phi_j \bar{\phi}_k$$

$$= \sum (U(t_j)u, U(t_k)u)\,\phi_j \bar{\phi}_k = \Bigl(\sum \phi_j U(t_j)u,\ \sum \phi_k U(t_k)u\Bigr) = \Bigl\|\sum \phi_j U(t_j)u\Bigr\|^2,$$

clearly a nonnegative quantity. □
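
The computation in this proof can be replayed numerically. The following is a minimal sketch, assuming NumPy, a hypothetical Hermitian matrix A generating $U(t) = e^{itA}$, and arbitrary sample points $t_j$ and coefficients $\phi_j$ chosen for the illustration. It forms the matrix with entries $a(t_j - t_k)$ and checks that it is Hermitian with nonnegative eigenvalues, and that the quadratic form equals $\|\sum \phi_j U(t_j)u\|^2$.

```python
import numpy as np

rng = np.random.default_rng(4)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (M + M.conj().T) / 2                  # hypothetical self-adjoint generator
lam, Q = np.linalg.eigh(A)

def U(t):
    """Unitary group U(t) = exp(itA)."""
    return (Q * np.exp(1j * t * lam)) @ Q.conj().T

def inner(x, y):
    """Scalar product (x, y), linear in x and conjugate linear in y."""
    return np.vdot(y, x)

u = rng.standard_normal(4) + 1j * rng.standard_normal(4)

def a(t):
    """The function a(t) = (U(t)u, u) of lemma 2."""
    return inner(U(t) @ u, u)

ts = np.array([-1.5, -0.2, 0.0, 0.9, 2.3, 4.0])
phis = rng.standard_normal(6) + 1j * rng.standard_normal(6)

gram = np.array([[a(tj - tk) for tk in ts] for tj in ts])
hermitian_defect = np.linalg.norm(gram - gram.conj().T)
min_eig = np.linalg.eigvalsh(gram).min()

quad_form = phis @ gram @ phis.conj()     # sum of a(t_j - t_k) phi_j conj(phi_k)
combo = sum(phis[j] * (U(ts[j]) @ u) for j in range(len(ts)))
norm_sq = np.linalg.norm(combo) ** 2
print(hermitian_defect, min_eig, abs(quad_form - norm_sq))
```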


According to Bochner's theorem, theorem 8 in chapter 14, a positive definite function
is the Fourier transform of a nonnegative measure; so the function (2) can be
represented as

$$(U(t)u, u) = \int e^{it\lambda}\,dm(\lambda). \tag{3}$$

Setting $t = 0$, we obtain $\|u\|^2 = m(\mathbb{R})$. The measure $m$ depends on the vector $u$; it
is the Fourier transform of $(U(t)u, u)$ and is therefore uniquely determined by $u$.

The right side of (2) is a quadratic function of $u$; we associate with it a skew
bilinear function

$$(U(t)u, v) = \int e^{it\lambda}\,dm(\lambda; u, v), \tag{4}$$

where the measures $m(u, v)$ are formed out of the measures $m(u)$ by polarization.
Lemma 3. The measure $m(u, v)$ has the following properties:


(i) $m(u, u) = m(u)$.
(ii) $m(v, u) = \overline{m(u, v)}$.
(iii) $m(u, v)$ depends linearly on $u$, skew linearly on $v$.
(iv) $|m(S, u, v)| \le \|u\|\,\|v\|$ for any Borel set S.


Proof. The proof is the same as of theorem 17 in chapter 31. □




We appeal now to theorem 1 of chapter 31 on bounded sesquilinear forms to 
conclude that for any Borel set S, 


m(S, u, v) = (E(S)u, v), 


where $E(S)$ is a bounded, symmetric operator, $\|E\| \le 1$. So we can rewrite (4) as

$$(U(t)u, v) = \int e^{it\lambda}\,d(Eu, v). \tag{5}$$


We claim that E is the resolution of the identity for the operator A, meaning that 
it has properties stated in theorem 1 of chapter 32. The proof is analogous to the 
arguments used in chapter 31, section 31.7, on the spectral resolution of a single 
unitary operator. We leave it to the reader to complete the details. O 


35.2 ERGODIC THEORY 
Strongly continuous groups of unitary operators serve as a good setting for the mean
ergodic theorem. We present the abstract theorem in this section, the connection with
ergodicity of dynamical systems in the next section.


Theorem 4 (von Neumann). Let U(t) be a strongly continuous one-parameter 
group of unitary operators mapping a Hilbert space H onto itself. 


(i) Denote by F the set of all vectors f in H that remain fixed under the action 
of the group: U(t)f = f for allt. Then F is a closed linear subspace of H. 


(ii) Denote by $M(t)$ the averaging operator

$$M(t)g = \frac{1}{t}\int_0^t U(s)g\,ds. \tag{6}$$

Then as $t \to \infty$, $M(t)$ converges strongly, meaning that for every $g$ in H,

$$\operatorname*{s-lim}_{t \to \infty} M(t)g = Pg$$

exists, and the strong limit P is the orthogonal projection onto F.


Proof. The original proof of von Neumann relied on the spectral resolution of
unbounded self-adjoint operators. The simple proof presented here is due to Eberhard
Hopf.

That F is a closed linear subspace is obvious, for the nullspace of the continuous
operator $U(t) - I$ is closed and linear and so is their intersection F. For part (ii) we
need first

Lemma 5. Let U be a unitary operator, E the nullspace of $U - I$, and R its range.
We claim that E is the orthogonal complement of R.




Proof. For any two vectors $g$, $h$ in H,

$$((U - I)g, h) = (g, (U^* - I)h).$$

Using $U^*U = I$, see section 31.7, we can rewrite this as

$$((U - I)g, h) = (g, (I - U)U^*h),$$

which shows that if $g$ is orthogonal to the range R, the right side is zero for all $h$.
Then so is the left side; therefore $g$ lies in the nullspace E. The converse can be
proved by reversing the argument. □


We take $U = U(r)$, $r$ any real number. According to the orthogonal decomposition
theorem applied to the closure $\bar{R}$ of the range of $U - I$, every $g$ in H can be
decomposed as

$$g = e + z, \tag{7}$$

where $z$ belongs to $\bar{R}$ and $e$ is orthogonal to R. According to lemma 5, $e$ belongs to
the nullspace E of $U(r) - I$. We claim that $M(t)z$ tends to zero as $t$ tends to $\infty$. To
see this approximate $z$ arbitrarily closely by a vector $z_\epsilon$ in R:

$$\|z - z_\epsilon\| < \epsilon, \qquad z_\epsilon = (U(r) - I)h.$$

From the definition (6) of M and because $\|U(s)\| = 1$, it follows that $\|M(t)\| \le 1$.
So for all $t$,

$$\|M(t)z - M(t)z_\epsilon\| = \|M(t)(z - z_\epsilon)\| < \epsilon.$$

Using the definition (6) of M and that $U(s)U(r) = U(s + r)$, we write

$$M(t)z_\epsilon = M(t)(U(r) - I)h = M(t)U(r)h - M(t)h = \frac{1}{t}\int_0^t U(s)U(r)h\,ds - \frac{1}{t}\int_0^t U(s)h\,ds$$

$$= \frac{1}{t}\int_t^{t+r} U(s)h\,ds - \frac{1}{t}\int_0^r U(s)h\,ds.$$

Each term on the right side is in norm at most $r\|h\|/t$; this shows that $\|M(t)z_\epsilon\|$
tends to zero as $t \to \infty$. Since $M(t)z_\epsilon$ differs by less than $\epsilon$ from $M(t)z$, and $\epsilon$ is
arbitrary, it follows that $\|M(t)z\|$ tends to zero as $t \to \infty$.

Next we show that for $e$ in the nullspace E of $U(r) - I$, $M(t)e$ tends strongly to a
limit as $t \to \infty$, and that this limit belongs to E. $U(t + r)e = U(t)U(r)e = U(t)e$
shows that $U(t)e$ is a periodic function of $t$ with period $r$; therefore, writing $t$ mod $r$
as $t = nr + q$, $0 \le q < r$, $n$ some natural number, we have

$$M(t)e = \frac{1}{t}\int_0^t U(s)e\,ds = \frac{n}{t}\int_0^r U(s)e\,ds + \frac{1}{t}\int_0^q U(s)e\,ds.$$

Clearly, as $t \to \infty$, the second term on the right tends to zero, and the first term
tends to

$$\frac{1}{r}\int_0^r U(s)e\,ds. \tag{8}$$

Since $U(s)$ commutes with $U(r) - I$, it maps its nullspace E into E. Therefore (8)
belongs to E.

According to (7), every $g$ in H can be decomposed as $e + z$; it follows that the
limit of $M(t)g$ as $t \to \infty$ exists for every $g$ in H, and that the limit belongs to E, the
set of vectors fixed under $U(r)$. Since $r$ is arbitrary, it follows that the limit belongs
to F, the set of vectors fixed under the action of every operator of the group.

The operators $U(s)$ act as the identity on F; we claim that they map the orthogonal
complement of F into the orthogonal complement of F. To see this, suppose that $w$
is orthogonal to F; the relation

$$(U(s)w, f) = (w, U^*(s)f) = (w, U(-s)f) = (w, f) = 0$$

shows that then $U(s)w$ is orthogonal to F. Since the operators $M(t)$ defined in (6)
are averages of the operators $U(s)$, they also act as the identity on F and map the
orthogonal complement of F into itself. The same is true of the strong limit of $M(t)$.
Since we have shown that this strong limit maps H into F, it follows that the strong
limit maps the orthogonal complement of F into 0. This shows that $\operatorname{s-lim}_{t \to \infty} M(t)$
is orthogonal projection onto F. □
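
A finite-dimensional example makes the theorem concrete. In the sketch below, which assumes NumPy, the group is $U(s) = e^{isA}$ for a hypothetical Hermitian matrix A whose eigenvalue 0 has multiplicity two, so that F is the nullspace of A. In the eigenbasis the average (6) can be written in closed form, each eigenvalue $\lambda \ne 0$ contributing the factor $(e^{i\lambda t} - 1)/(i\lambda t)$; the code compares this with a direct quadrature and then with the orthogonal projection P onto F.

```python
import numpy as np

rng = np.random.default_rng(3)
lam = np.array([0.0, 0.0, 0.5, -1.3, 2.0])           # hypothetical spectrum of A
X = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
Q, _ = np.linalg.qr(X)                               # unitary matrix of eigenvectors
A = (Q * lam) @ Q.conj().T                           # U(s) = exp(isA)
P = Q[:, :2] @ Q[:, :2].conj().T                     # orthogonal projection onto F = ker A

def M(t):
    """Averaging operator (6) in closed form: (1/t) * integral of exp(isA) over [0, t]."""
    theta = lam * t
    phi = np.ones_like(theta, dtype=complex)
    nz = theta != 0
    phi[nz] = (np.exp(1j * theta[nz]) - 1) / (1j * theta[nz])
    return (Q * phi) @ Q.conj().T

def M_quadrature(t, g, n=20000):
    """The same average applied to g, by the midpoint rule in s."""
    s = (np.arange(n) + 0.5) * t / n
    phases = np.exp(1j * np.outer(s, lam)).mean(axis=0)
    return Q @ (phases * (Q.conj().T @ g))

g = rng.standard_normal(5) + 1j * rng.standard_normal(5)
quadrature_gap = np.linalg.norm(M(10.0) @ g - M_quadrature(10.0, g))
errs = [np.linalg.norm(M(t) @ g - P @ g) for t in (10.0, 100.0, 1000.0)]
print(quadrature_gap, errs)
```

In this example the distance from $M(t)g$ to $Pg$ is bounded by $2\|g\|/(t \min|\lambda|)$, the minimum taken over the nonzero eigenvalues; in infinite dimensions, where the spectrum may accumulate at 0, no such rate is available and only strong convergence survives.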


35.3 THE KOOPMAN GROUP 


Let M be an open, compact differentiable manifold, with some prescribed volume
element V. We wish to study volume-preserving flows along vector fields D, namely
solutions of the differential equation

$$\frac{d}{dt}x = D(x). \tag{9}$$

Among such flows are those along Hamiltonian vector fields. Denote by $x(y; t)$ the
position at time $t$ of that solution of (9) whose value at time zero is $y$; the mappings
$y \to x(y; t)$ are volume preserving. Since M is compact, its volume is finite. The
vector field D is independent of $t$; therefore

$$x(x(z; s); t) = x(z; s + t). \tag{9'}$$


Bernard Koopman has associated with every such flow a one-parameter group of 
unitary operators, acting on the Hilbert space of square integrable functions g on M: 


(U(t)g)(y) = g(x(y; t)). (10) 




It follows from the volume-preserving character of the flow that the operators $U(t)$
defined above preserve the $L^2$-norm; in fact they preserve all the $L^p$-norms, $1 \le
p \le \infty$.

What does the mean ergodic theorem, theorem 4, say about these flows; that is,
what can we say here about the space F of functions that remain fixed under the
operators (10)? Obviously every constant function remains fixed; are there any others?
Let $f$ be such a function; we may take $f$ to be real. Let $c$ be any real number; the
set $S_c$ of points $y$ in M where $f(y) < c$ is then invariant under the flow; that is, if
$y$ belongs to $S_c$, so do all points $x(y; t)$. If $f$ is nonconstant, there is a value of $c$ for
which the invariant set $S_c$ is nontrivial in the sense that neither $S_c$ nor its complement
in M has measure zero. Conversely, if there is a nontrivial set S invariant under the
flow, its characteristic function

$$f(y) = \begin{cases} 1 & \text{if } y \text{ in } S \\ 0 & \text{if not} \end{cases}$$

remains fixed under the group (10). So we have shown:

Only the constant functions on M are fixed under all operators of the Koopman 
group associated with a given volume preserving flow on M iff M has no nontrivial 
measurable subset that is invariant under the flow. 

A flow that has no nontrivial invariant measurable subsets is called metrically transitive.

Suppose that there are no nontrivial invariant sets under the flow (9). Then F
consists of constants, and according to the mean ergodic theorem, for every function
g in $L^2(M)$, M(t)g tends to the projection of g into the space of constants. What
is that projection? Clearly, it is the mean value of g over M with respect to the
prescribed volume:

$$\frac{1}{\operatorname{Vol}(M)} \int_M g \, dV. \tag{11}$$


The projection (11) is called the space average of the function g; the limits of the
averaging operators M defined by (6) are called time averages of g. So, loosely
speaking, the mean ergodic theorem applied to the Koopman group asserts that the
space average and the time average of an arbitrary $L^2$ function are equal, provided
that the flow (9) in question has no nontrivial invariant subset.
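The equality of time and space averages can be tried out numerically on a simple example. The sketch below is an illustration, not part of the text: it assumes M is the 2-torus with Lebesgue measure and takes the constant vector field D = (1, √2), whose flow x(y; t) = y + tD (mod 1) is volume preserving. The function g, the starting point, and the helper name `time_average` are all hypothetical choices. For comparison, the degenerate field D = (1, 0) leaves the second coordinate fixed, so it has nontrivial invariant sets and the time average need not equal the space average.

```python
import math

def time_average(g, y, direction, T, steps=20000):
    """Midpoint-rule approximation of (1/T) * integral_0^T g(x(y; t)) dt
    for the linear flow x(y; t) = y + t * direction (mod 1) on the 2-torus."""
    dt = T / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        x1 = (y[0] + t * direction[0]) % 1.0
        x2 = (y[1] + t * direction[1]) % 1.0
        total += g(x1, x2)
    return total / steps

def g(x1, x2):
    # Test function; its integral over the unit torus is 0 + 1/2.
    return math.cos(2 * math.pi * x1) + math.cos(2 * math.pi * x2) ** 2

space_average = 0.5

# Irrational slope: the time average approaches the space average.
irrational = time_average(g, (0.1, 0.3), (1.0, math.sqrt(2.0)), T=200.0)

# Degenerate direction: x2 never changes, so cos^2(2*pi*x2) is a nonconstant
# fixed function and the time average depends on the starting point.
degenerate = time_average(g, (0.1, 0.3), (1.0, 0.0), T=200.0)

print(irrational, degenerate, space_average)
```

Note that the mean ergodic theorem asserts convergence in the $L^2$ norm; the pointwise agreement seen here is a feature of this particular flow and of the continuity of g.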

In statistical mechanics the manifold M is phase space, the vector field derived
from the Hamiltonian of N interacting particles, $N \sim 10^{23}$. The time average is
interpreted as the measured value of a function g of N variables; the time of measurement
is large on the scale relevant in theorem 4. The significance of the ergodic theorem,
first proposed in a rudimentary form by Ludwig Boltzmann, is that instead of having
to solve a differential equation of form (9), involving about $10^{23}$ unknown functions,
we merely have to evaluate an integral (11) over a $10^{23}$-dimensional manifold.

Jack Schwartz has pointed out that the functions g whose measured values have
thermodynamic significance are highly symmetric functions of their $10^{23}$ variables.
The equality of the time and space average of such highly special functions might
well be due to reasons other than the ergodic theorem.




We remark that in general, it is very hard to decide which flows have nontrivial 
invariant subsets and which don’t. An amusing example is given in Lax. 
35.4 THE WAVE EQUATION 
The classical wave equation is

$$u_{tt} - \Delta u = 0,$$

Δ being the Laplace operator:

$$\Delta = \partial_x^2 + \partial_y^2 + \partial_z^2;$$

here subscripts denote partial derivatives. Let u be a solution of the wave equation
in full space-time $\mathbb{R}^3 \times \mathbb{R}$ that tends to zero, together with its first derivatives,
sufficiently rapidly as $x^2 + y^2 + z^2 \to \infty$. For such solutions the law of conservation
of energy may be derived as follows: multiply the wave equation by $u_t$, and integrate
over $\mathbb{R}^3$:

$$\int_{\mathbb{R}^3} u_t (u_{tt} - \Delta u)\, dV = 0.$$

Integration by parts changes this to

$$\int_{\mathbb{R}^3} (u_t u_{tt} + u_x u_{xt} + u_y u_{yt} + u_z u_{zt})\, dV = 0.$$


The integral is the t-derivative of

$$E(t) = \frac{1}{2} \int_{\mathbb{R}^3} (u_t^2 + u_x^2 + u_y^2 + u_z^2)\, dV;$$

so we conclude that E(t) is independent of t. Such a quantity is called a constant of
motion, or a conserved quantity.
E(t) is called energy; it is a quadratic functional of u and $u_t$. Its square root is
the energy norm.
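The two steps of this derivation can be written out explicitly. The following is a sketch under the decay assumption stated above, so that no boundary terms at infinity appear:

```latex
\begin{align*}
-\int_{\mathbb{R}^3} u_t\,\Delta u\,dV
  &= \int_{\mathbb{R}^3} \nabla u_t\cdot\nabla u\,dV
   = \int_{\mathbb{R}^3} (u_x u_{xt} + u_y u_{yt} + u_z u_{zt})\,dV,\\
u_t u_{tt} + u_x u_{xt} + u_y u_{yt} + u_z u_{zt}
  &= \tfrac12\,\partial_t\bigl(u_t^2 + u_x^2 + u_y^2 + u_z^2\bigr).
\end{align*}
```

Integrating the second identity over space gives dE/dt = 0.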
The initial data u(0) and $u_t(0)$ completely determine the solution for all time.
If two solutions have the same initial data, their difference, also a solution of the
wave equation, has zero initial data, and so zero initial energy. By the conservation
of energy, the difference has zero energy at all times, and so zero data at all times.
We denote by H the completion in the energy norm of the space of all initial data
$\{u(0), u_t(0)\}$ that have finite energy. H is a Hilbert space.
Denote by U(t) the solution operators, that is, the operators that map initial data
into data at time t:

$$U(t) : \{u(0), u_t(0)\} \to \{u(t), u_t(t)\}.$$




It is not hard to show (e.g., by taking the spatial Fourier transform) that the initial
value problem, that is, the problem of finding a solution in full space-time of the
wave equation with prescribed initial data u(0) and $u_t(0)$, can be solved if the initial
data are sufficiently smooth and have compact support. So the operators U(t) are
well defined on such data. Since these data form a dense subspace of all data H,
we can, by continuity, define the operator U(t) on all of H. It is not hard to show
that the extended operators form a strongly continuous group of unitary operators.
Strong continuity can be verified for smooth solutions with smooth initial data, and
then extended by continuity to all initial data with finite energy. The group property
U(s + r) = U(s)U(r) expresses the fact that if u(x, y, z, t) is a solution of the wave
equation, so is u(x, y, z, t − s). Unitarity is merely a restatement of the conservation
of energy, together with invertibility; the inversion is
accomplished by solving the initial value problem backward.


Exercise 1. What is the infinitesimal generator of the group formed by the solution
operator of the wave equation?


We describe now an important extension: the study of solutions of the wave equa- 
tion for all time but not in the whole three-dimensional space, only in the exterior of 
some obstacle B. On the obstacle all solutions are required to satisfy the boundary 
condition: u = 0 on the boundary of B. 

The conservation of energy is derived the same way as before. This time integration
by parts produces the boundary term

$$\int_{\partial B} u_t u_n \, dS, \qquad u_n = \text{normal derivative of } u,$$

which is zero for all u that vanish on the boundary. Note that we could also have
chosen to impose the boundary condition $u_n = 0$, to make the boundary term vanish.
In either case the law of conservation of energy follows.

We then proceed as before to construct the solution operators U(t), which form
a strongly continuous group of unitary operators in the energy norm. The only new
complication is that it is harder to prove the existence for all times of solutions of
the wave equation with prescribed smooth initial data, and satisfying the boundary
condition. But this is only a technical difficulty, which should not be allowed to
obscure the simple underlying structure of solutions with finite energy in the exterior
of an obstacle.


35.5 TRANSLATION REPRESENTATION 
The spectral representation of a self-adjoint operator A acting in a Hilbert space H
is described in chapter 32. When A has spectrum of multiplicity 1, there exists a
nonnegative measure m on R and a unitary mapping between H and $L^2(\mathbb{R}, m)$:

$$H \leftrightarrow L^2(\mathbb{R}, m)$$

such that the action of A is represented as multiplication by the variable $\lambda \in \mathbb{R}$. When
the spectrum of A is multiple, the spectral representation is a unitary mapping
between H and a possibly infinite Cartesian product of $L^2$ spaces on R equipped with
various nonnegative measures:

$$H \leftrightarrow \prod_j L^2(\mathbb{R}, m_j).$$

The action of A is represented by multiplication by $\lambda \in \mathbb{R}$ in each component.

According to Stone’s theorem, i times a self-adjoint operator A generates a
strongly continuous one-parameter group of unitary operators U(t). We denote these
operators symbolically as U(t) = exp iAt.


Theorem 6. In a spectral representation for a self-adjoint operator A, the action of
U(t) = exp iAt is represented as multiplication by exp iλt.


Proof. The proof of Stone’s theorem is based on the Hille-Yosida theorem,
theorem 7 of chapter 34. Yosida’s proof of this theorem constructs exp iAt as the strong
limit

$$\exp iAt = \operatorname*{s-lim}_{n \to \infty} e^{tG_n}, \qquad G_n = n^2 R(n) - nI. \tag{12}$$

In the course of deriving the spectral representation, chapter 31, section 31.5, we
have shown that the action of $R(n) = (n - iA)^{-1}$ is represented as multiplication
by $(n - i\lambda)^{-1}$. It follows that the action of the right side of (12) is represented as
multiplication by $e^{itn\lambda/(n - i\lambda)}$.

Let u be any vector in H, represented by the functions $\{k_j(\lambda)\}$. Then $e^{tG_n}u$ is
represented by $\{e^{itn\lambda/(n - i\lambda)} k_j(\lambda)\}$; the $L^2(m_j)$ limit as $n \to \infty$ of these functions
is $\{e^{i\lambda t} k_j(\lambda)\}$. □
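The passage to the limit in the last step can be justified by dominated convergence. The following sketch treats t ≥ 0, the range covered by the Hille-Yosida construction, and writes $k_j$ for the components of the function representing u:

```latex
\begin{align*}
\frac{in\lambda}{n-i\lambda}
  &= \frac{in\lambda\,(n+i\lambda)}{n^2+\lambda^2}
   = \frac{i n^2\lambda - n\lambda^2}{n^2+\lambda^2}
   \;\longrightarrow\; i\lambda \quad (n\to\infty),\\
\Bigl|e^{itn\lambda/(n-i\lambda)}\Bigr|
  &= \exp\!\Bigl(-\frac{t\,n\lambda^2}{n^2+\lambda^2}\Bigr) \le 1
   \qquad (t \ge 0),\\
\Bigl|e^{itn\lambda/(n-i\lambda)}k_j(\lambda) - e^{i\lambda t}k_j(\lambda)\Bigr|^2
  &\le 4\,|k_j(\lambda)|^2 .
\end{align*}
```

The integrand on the last line tends to zero pointwise and is dominated by an integrable function, so its integral with respect to $m_j$ tends to zero.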


We consider now the case where the spectrum is absolutely continuous and of
uniform multiplicity on R. That means that all the measures $m_j$ in (18) are absolutely
continuous with respect to Lebesgue measure and their support is all of R. As
remarked at the end of chapter 31, in this case the measures entering the representation
can be taken as Lebesgue measure on all of R, and (18) can be rewritten as

$$H \leftrightarrow \prod L^2(\mathbb{R}).$$

It is convenient to put together a Cartesian product of $L^2(\mathbb{R})$ spaces as a single
space $L^2(N, \mathbb{R})$ consisting of $L^2$ functions whose values lie in an auxiliary Hilbert
space N; the dimension of N equals the number of components in the Cartesian
product, possibly ∞. Note that the dimension of N is the multiplicity of the
spectrum of A. So we can write the spectral representation as

$$H \leftrightarrow L^2(N, \mathbb{R}). \tag{13}$$




By taking a Fourier inverse, we can obtain from (13) another representation

$$H \leftrightarrow L^2(N, \mathbb{R}). \tag{13'}$$

That is, if the vector u in H is represented in (13) by the function f(λ), then (13')
assigns to u the representation

$$k(x) = \frac{1}{\sqrt{2\pi}} \int f(\lambda)\, e^{-i\lambda x}\, d\lambda.$$

It follows from theorem 6 that (exp iAt)u is represented in (13) by exp(iλt) f(λ),
and so in (13') by k(x − t). For this reason (13') is called a translation
representation of H for the unitary group generated by iA. Conversely, from a translation
representation we can construct a spectral representation by Fourier inversion.
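The translation property is a one-line computation. The sketch below assumes the Fourier convention $k(x) = (2\pi)^{-1/2}\int f(\lambda)e^{-i\lambda x}\,d\lambda$; the sign in the exponent is the one that makes the shift come out as x − t, and the normalization is chosen to make the map unitary:

```latex
\frac{1}{\sqrt{2\pi}}\int e^{i\lambda t} f(\lambda)\, e^{-i\lambda x}\, d\lambda
  \;=\; \frac{1}{\sqrt{2\pi}}\int f(\lambda)\, e^{-i\lambda (x-t)}\, d\lambda
  \;=\; k(x-t).
```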

We turn now to a geometrical characterization of translation representations due
to Sinai. Let H be a Hilbert space, U(t) a strongly continuous one-parameter group
of unitary operators on H that has a translation representation. Denote by F the
subspace of H consisting of vectors represented by functions supported on $\mathbb{R}_-$:

$$F \leftrightarrow L^2(N, \mathbb{R}_-). \tag{14}$$

Clearly, U(r)F consists of vectors represented by functions supported on (−∞, r).
It follows then that the one-parameter family U(r)F increases as r does, going from
{0} to H as r goes from −∞ to ∞. We express this more precisely thus:

$$U(r)F \subset F \quad \text{for } r < 0, \tag{15a}$$
$$\bigcap_r U(r)F = \{0\}, \tag{15b}$$
$$\overline{\bigcup_r U(r)F} = H. \tag{15c}$$


Theorem 7. Conversely, let H be a Hilbert space, U(t) a strongly continuous
one-parameter group of unitary operators mapping H into H. Let F be a closed
subspace of H, and suppose that all three conditions (15) are satisfied. Then H has a
translation representation (13') for U(t) where F is given by (14).


Exercise 2. Show that U(s)F ⊂ U(t)F when s < t.


Sinai deduced this result from von Neumann’s theorem about the Heisenberg 
commutation relation. Phillips and the author gave an independent proof and showed 
how to deduce from it the result of von Neumann; we will present that in the next 
section. We give here the proof due to Phillips and Lax; it is a little technical, but 
pretty, at least in a parent’s eyes. 


Proof. We will deduce theorem 7 from another representation theorem: 


Theorem 8. Let K be a Hilbert space, Z(t) a strongly continuous one-parameter
semigroup of contractions mapping K into K. Assume furthermore that Z(t) tends
to 0 strongly as t tends to ∞:

$$\lim_{t \to \infty} Z(t)k = 0 \tag{16}$$

for all k in K. Then K can be unitarily represented as a closed subspace of
$L^2(N, \mathbb{R}_-)$, N some auxiliary Hilbert space, so that the action of Z(t) is
translation to the right by t, restricted to $\mathbb{R}_-$.

Proof. Denote the generator of the semigroup Z by G, D(G) the domain of G.
We want to define first a representation for vectors g in D(G), and then extend this
representation by continuity to all of K. To any g in D(G), we assign the
vector-valued function y(s) as

$$y(s) = Z(-s)g, \qquad s < 0. \tag{17}$$

Thus the function y is defined on $\mathbb{R}_-$ and its values lie in D. We define now a new
norm in D, denoted as $\|g\|_N$, so that the $L^2$ norm of y is equal to $\|g\|$. That is, we
require that for all g in D(G),

$$\|g\|^2 = \int_{-\infty}^{0} \|y(s)\|_N^2 \, ds = \int_0^{\infty} \|Z(t)g\|_N^2 \, dt. \tag{18}$$

Since Z maps D into D, this must hold for g replaced by Z(h)g,

$$\|Z(h)g\|^2 = \int_0^{\infty} \|Z(s + h)g\|_N^2 \, ds = \int_h^{\infty} \|Z(s)g\|_N^2 \, ds. \tag{18'}$$

Differentiate both sides of (18') and set h = 0; we get

$$(Gg, g) + (g, Gg) = -\|g\|_N^2. \tag{19}$$


We take this as the definition of $\|g\|_N$. Note that since we assumed that the Z(t) are
contractions, the left side of (18') is a nonincreasing function of h, and therefore the
left side of (19) is nonpositive. This shows that our new norm $\|g\|_N$ is nonnegative.
We define the auxiliary Hilbert space N as the completion of D in the norm $\|g\|_N$,
modulo the vectors of norm zero.


We verify now that with this definition of N and the N-norm, the representation
(17) is an isometry, namely that (18) holds for all g in D. Since Z(t)g belongs to D,
we can use definition (17) and (19); writing s = −t, we get

$$\|y(s)\|_N^2 = \|Z(t)g\|_N^2 = -2\,\mathrm{Re}\,(GZ(t)g, Z(t)g) = -\frac{d}{dt}\|Z(t)g\|^2. \tag{19'}$$

We integrate this with respect to t from 0 to r, and using the hypothesis that $\|Z(r)g\|$
tends to 0 as r tends to ∞, we obtain (18).



Since D is dense in K, we can extend by continuity the representation (17) as an
isometry of all K. Clearly, in this representation Z(t) acts as translation followed by
restriction to $\mathbb{R}_-$. □
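A simple example shows what the auxiliary norm looks like. The choice of K below is an illustrative assumption, not taken from the text: K is the scalar space $L^2(\mathbb{R}_-)$ and Z(t) is right translation followed by restriction, so the theorem should return the identity representation with N one-dimensional.

```latex
\begin{align*}
&K = L^2(\mathbb{R}_-), \qquad (Z(t)g)(s) = g(s-t) \quad (s<0,\ t\ge 0), \qquad Gg = -g',\\
&(Gg,g) + (g,Gg) = -\int_{-\infty}^{0} \bigl(g'\bar g + g\,\bar g'\bigr)\,ds
   = -|g(0)|^2
   \quad\Longrightarrow\quad \|g\|_N = |g(0)|,\\
&\|y(s)\|_N = \bigl|(Z(-s)g)(0)\bigr| = |g(s)|,
   \qquad
   \int_{-\infty}^{0}\|y(s)\|_N^2\,ds = \int_{-\infty}^{0}|g(s)|^2\,ds = \|g\|^2 .
\end{align*}
```

Here the completion of D in the N-norm, modulo vectors of norm zero, is one-dimensional, and (18) holds as required.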


We return now to theorem 7. Denote by P projection onto the subspace F and
define the operators Z(t) by

$$Z(t) = PU(t), \qquad t \ge 0. \tag{20}$$

Lemma 9. Suppose that U has the properties postulated in (15). Then Z(t) defined
by (20) form a strongly continuous one-parameter semigroup of contractions
mapping F → F, and Z(t) tends strongly to 0 as t → ∞.


Proof. Since U(t) is strongly continuous, so is Z(t). Since U(t) and P are
contractions, so is Z(t). To show that Z(t) is a semigroup, take any vector f in F; by
definition of Z,

$$\begin{aligned} Z(r)Z(s)f &= PU(r)PU(s)f = PU(r)[U(s)f + p] \\ &= PU(r + s)f + PU(r)p = Z(r + s)f + PU(r)p; \end{aligned} \tag{21}$$

here p denotes some vector orthogonal to F. We claim that also U(r)p is orthogonal
to F for r > 0. To see this, take any vector f in F; since U(r) is unitary,
$U^*(r) = U^{-1}(r) = U(-r)$:

$$(f, U(r)p) = (U^*(r)f, p) = (U(-r)f, p).$$

According to assumption (15a), U(−r)f belongs to F; since p is orthogonal to F,
the last term on the right is zero. Since P maps vectors orthogonal to F into 0, it
follows that the last term on the right in (21) is zero; this shows that the Z(t) form a
semigroup.

To prove that Z(t) tends strongly to zero as t tends to ∞, we first show that the
set of vectors of form

$$U(t)F^{\perp}, \qquad t < 0, \tag{22}$$

where $F^{\perp}$ denotes the orthogonal complement of F in H, is dense in H. We argue
indirectly and suppose not; then there would be a nonzero vector v in H orthogonal
to all vectors $U(t)F^{\perp}$. The unitary operator U(t) maps orthogonal complements into
orthogonal complements. The orthogonal complement of $U(t)F^{\perp}$ is U(t)F; so it
follows that v belongs to U(t)F. This holds for all t < 0, but it contradicts (15b).
Given any f in F and any ε > 0, we can, according to the above, find g in $F^{\perp}$ and
r < 0 such that ‖f − U(r)g‖ < ε. Therefore, writing s = −r, we have
‖U(s)f − g‖ < ε. The projection P maps g into zero. Therefore applying the projection P, we
conclude that ‖Z(s)f‖ = ‖PU(s)f‖ < ε. Since the operators Z are contractions,
‖Z(t)f‖ < ε for all t > s. □




It follows from lemma 9 that theorem 8 is applicable to the semigroup Z(t) =
PU(t) defined on F. There is an isometric translation representation that assigns to
each f in F a function φ in $L^2(N, \mathbb{R}_-)$:

$$f \leftrightarrow \varphi(s) \tag{23}$$

so that for any t > 0,

$$PU(t)f \leftrightarrow c(s)\varphi(s - t), \tag{23'}$$

where c is the characteristic function of $\mathbb{R}_-$. The representation is isometric:

$$\|f\|^2 = \int_{-\infty}^{0} |\varphi(s)|^2 \, ds, \tag{24}$$

$$\|PU(t)f\|^2 = \int_{-\infty}^{-t} |\varphi(s)|^2 \, ds. \tag{25}$$

We extend the representation to all vectors in H of form U(t)f, f in F, by setting

$$U(t)f \leftrightarrow \varphi(s - t). \tag{26}$$

It follows from (24) and the isometry of U(t) that this assignment is isometric. We
claim that (26) is consistent with (23'); that is, if U(t)f belongs to F, then the right
sides of (23') and (26) are equal. Clearly, this is so iff the function φ representing f
is zero for −t < s < 0. That this condition is satisfied can be seen by comparing
(24) and (25). The left sides are equal; therefore so are the right sides, which can be
only if φ = 0 in (−t, 0).
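The comparison of (24) and (25) can be spelled out as follows. This is a sketch that writes φ for the function representing f and uses only the unitarity of U(t) and the fact that P acts as the identity on F:

```latex
\begin{align*}
U(t)f \in F
  &\;\Longrightarrow\; PU(t)f = U(t)f
   \;\Longrightarrow\; \|PU(t)f\| = \|U(t)f\| = \|f\|,\\
0 = \|f\|^2 - \|PU(t)f\|^2
  &= \int_{-\infty}^{0}|\varphi(s)|^2\,ds - \int_{-\infty}^{-t}|\varphi(s)|^2\,ds
   = \int_{-t}^{0}|\varphi(s)|^2\,ds .
\end{align*}
```

A nonnegative integrand with zero integral vanishes almost everywhere, which gives φ = 0 on (−t, 0).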

According to (15c), vectors of form U(t)f, f in F, are dense in H. Therefore the
representation (26) can be extended by continuity to all u in H:

$$u \leftrightarrow k(x).$$

This representation is isometric,

$$\|u\|^2 = \int_{-\infty}^{\infty} |k(x)|^2 \, dx,$$

transmutes the action of U into translation,

$$U(t)u \leftrightarrow k(x - t),$$

and the action of P into truncation,

$$Pu \leftrightarrow c(s)k(s).$$

It is not hard to show that the range of this representation is all of $L^2(N, \mathbb{R})$; see
Lax-Phillips, 1981. □




The conditions under which a unitary group has a translation representation may
appear somewhat special. Nevertheless, there are natural, interesting, and nontrivial
examples coming from wave propagation as described in section 35.4. There the
underlying Hilbert space H is the set of all initial data $\{u(x, 0), u_t(x, 0)\}$ defined in
$\mathbb{R}^3$ with the energy norm

$$\|\{u(0), u_t(0)\}\|^2 = \int \Big( \sum_j u_{x_j}^2 + u_t^2 \Big)\, dx.$$

The group U(t) is the group of solution operators for the wave equation

$$u_{tt} - \Delta u = 0, \qquad U(t)\{u(x, 0), u_t(x, 0)\} = \{u(x, t), u_t(x, t)\}.$$

The unitary character of U(t) expresses the conservation of energy and the
reversibility of time.

The role of the distinguished subspace F is taken by the so-called incoming initial
data. These are initial data of incoming solutions u(x, t), which are zero inside the
backward light cone:

$$u(x, t) = 0 \quad \text{for } |x| < -t.$$

It is far from obvious that there are any incoming solutions at all. We show now
how to construct them by relying on Huygens’s principle for wave propagation in
three-dimensional space. Conceptually this principle says that solutions of the wave
equation propagate information with speed equal to 1. Technically this means that
the value of a solution u(y, s), (y, s) in $\mathbb{R}^3 \times \mathbb{R}$, uniquely determined by the initial
data $\{u(0), u_t(0)\}$, depends only on the values of u(0) and $u_t(0)$ and their space
derivatives, at the intersection of the light cone with the initial plane, that is, the
points x satisfying |x − y| = |s|.

Let $f = \{f_1, f_2\}$ be data that are zero for |x| > S. We claim that for T > S,
U(−T)f is incoming.


Exercise 3. Show, using Huygens’s principle, that U(−T)f is incoming.


We show now that the space F of incoming data has all three properties listed in
(15a) to (15c):

(i) Let u denote an incoming solution, f its initial data. The solution whose
initial data are U(r)f is $u^{(r)}(x, t) = u(x, t + r)$. Since u is incoming,

$$u(x, t) = 0 \quad \text{for } |x| < -t.$$

Therefore

$$u^{(r)}(x, t) = 0 \quad \text{for } |x| < -(t + r).$$

When r < 0, $u^{(r)}$ is incoming; this shows that U(r)F ⊂ F for r < 0.

(ii) The relation above shows that $u^{(r)}(x, 0) = 0$ for |x| < −r and, after
differentiating with respect to t, that $u_t^{(r)}(x, 0) = 0$ for |x| < −r. It follows that
the intersection of U(r)F contains only the zero data.

(iii) Let u be any of the incoming solutions constructed above, f its initial data.
It follows from the construction that U(T)f can be taken as arbitrary data
supported in |x| < T. Clearly, the union of U(T)F as T → ∞ is dense
in H.


Huygens’s principle is valid in any space of odd dimension, and so is the analysis
of incoming data. One could deduce the properties of incoming data from explicit
formulas for solutions of the wave equation in $\mathbb{R}^n \times \mathbb{R}$, but our derivation is more
illuminating.

Far more interesting is the case of wave propagation in the exterior of an obstacle,
discussed in section 35.4. Suppose that the obstacle is contained inside a ball around
the origin of radius R. We define an incoming solution u(x, t) as vanishing inside
the cone

$$u(x, t) = 0 \quad \text{for } |x| < -t + R.$$


Note that such an incoming solution satisfies both boundary conditions discussed 
in section 35.4. Define again F as the initial data of incoming solutions. Properties 
(15a) and (15b) can be immediately verified as before, but property (15c) lies much 
deeper (see chapter V of the book by Lax and Phillips). We will return to this example 
in chapter 36, section 36.5. 

Another very interesting example is furnished by the automorphic wave equation 
in hyperbolic space. We will take it up in chapter 37. 


35.6 THE HEISENBERG COMMUTATION RELATION 


In quantum mechanics the state of a physical system is a unit vector u, ‖u‖ = 1, in
a Hilbert space H over C associated with the physical system. Each observable is
identified with a self-adjoint operator, constructed according to the so-called rules
of quantization.


Definition. The expected value of an observable A in state u is defined to be (u, Au).


Here we assume that u belongs to the domain of A.


The term “expected value” implies an uncertainty in the measurement of the
observable in state u:


Definition. The uncertainty in the measurement of an observable A in state u is the
square root of the expected value in the state u of $(A - aI)^2$, where a is the expected
value of A. We denote this uncertainty by Δ(A, u):

$$\Delta^2(A, u) = (u, (A - a)^2 u) = \|Au - au\|^2 = \|Au\|^2 - 2a(u, Au) + a^2 = \|Au\|^2 - a^2. \tag{27}$$

The third formula shows that the uncertainty Δ(A, u) equals zero only if Au − au =
0, namely if u is an eigenstate of A.


Let A and B denote a pair of observables. According to (27), both can be measured
with absolute certainty in the same state u iff u is an eigenvector of both A and B.
Pairs of commuting operators have common eigenvectors, but in general, this is not
to be expected.

Suppose that the pair of self-adjoint operators A and B satisfy the Heisenberg
commutation relation

$$AB - BA = iI. \tag{28}$$


A and B have no common eigenvector, for such a vector would be mapped into zero 
by the left side of (28). This shows that such observables A and B cannot both be 
measured with absolute certainty. Heisenberg has made this uncertainty quantitative: 


Theorem 10. Suppose a pair of self-adjoint operators A and B satisfy the
Heisenberg commutation relation (28). Then in any state u that belongs to the domain of
both A and B the uncertainties in the measurement of A and B satisfy the inequality

$$\Delta(A, u)\,\Delta(B, u) \ge \tfrac{1}{2}. \tag{29}$$

(29) is called the Heisenberg uncertainty principle.


Proof. We start by reformulating the commutation relation. Let (28) act on a
vector u, and form the scalar product with u. Operating formally, using the symmetry of
the operators A and B, we get

$$(Bu, Au) - (Au, Bu) = i\|u\|^2. \tag{28'}$$

We require (28') to hold for all vectors in the domain of both A and B. This is a
kind of weak interpretation of (28). Let t denote any real number. According to the
Schwarz inequality, for any unit vector u that belongs to the domain of both A and B,

$$|(u, Au + itBu)|^2 \le \|Au + itBu\|^2. \tag{30}$$

Denote the expected value of A and B by a and b, respectively; then inequality (30)
can be written as

$$a^2 + b^2 t^2 \le \|Au\|^2 + i[(Bu, Au) - (Au, Bu)]t + \|Bu\|^2 t^2. \tag{30'}$$

Using (28'), we see that the middle term on the right in (30') equals −t. Using this
and the notation introduced in (27), we can by subtracting the left side in (30') from
the right side rewrite (30') as

$$0 \le (\|Au\|^2 - a^2) - t + (\|Bu\|^2 - b^2)t^2 = \Delta^2(A, u) - t + \Delta^2(B, u)t^2.$$

In words, the quadratic polynomial on the right is nonnegative for all real t. Therefore
its discriminant is nonpositive:

$$1 - 4\Delta^2(A, u)\Delta^2(B, u) \le 0;$$

this is inequality (29). □
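The inequality can be checked numerically for a concrete pair. The sketch below is an illustration under stated assumptions: it takes A = i d/dμ and B = μ on $L^2(\mathbb{R})$, and a real Gaussian state, for which the product of uncertainties is expected to equal the lower bound 1/2. The function name `uncertainty_product`, the grid parameters, and the use of central differences are hypothetical choices made for the example.

```python
import math

def uncertainty_product(sigma=1.0, half_width=20.0, n=8001):
    """Approximate Delta(A, u) * Delta(B, u) for A = i d/dmu, B = mu and the
    Gaussian state u(mu) proportional to exp(-mu^2 / (4 sigma^2))."""
    h = 2.0 * half_width / (n - 1)
    xs = [-half_width + k * h for k in range(n)]

    # Real Gaussian, normalised so that the discrete L^2 norm equals 1.
    u = [math.exp(-x * x / (4.0 * sigma ** 2)) for x in xs]
    norm = math.sqrt(sum(v * v for v in u) * h)
    u = [v / norm for v in u]

    # Central differences for u'; the end values are negligible and left at 0.
    du = [0.0] * n
    for k in range(1, n - 1):
        du[k] = (u[k + 1] - u[k - 1]) / (2.0 * h)

    # For real u the expected value of A is 0 (the integral of u * u' vanishes),
    # so Delta^2(A, u) = ||Au||^2 = integral of |u'|^2.
    var_a = sum(d * d for d in du) * h

    # Delta^2(B, u) = ||Bu||^2 - b^2 with b = (u, Bu).
    mean_b = sum(x * v * v for x, v in zip(xs, u)) * h
    var_b = sum(x * x * v * v for x, v in zip(xs, u)) * h - mean_b ** 2

    return math.sqrt(var_a) * math.sqrt(var_b)

print(uncertainty_product(1.0), uncertainty_product(2.0))
```

Changing sigma trades uncertainty in B against uncertainty in A, while the product stays at the bound in (29).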


What pair of operators satisfy the commutation relations (28)? An elegant
argument of Wielandt shows that no bounded operators do. To see this, we deduce by
induction from (28) that for all natural numbers n,

$$i n B^{n-1} = AB^n - B^n A. \tag{28''}$$

Taking the norm of both sides and using the triangle and product inequalities on the
right gives

$$n\|B^{n-1}\| \le 2\|A\|\,\|B^n\| \le 2\|A\|\,\|B\|\,\|B^{n-1}\|,$$

which implies that $\|B^{n-1}\| = 0$ for n > 2‖A‖‖B‖; so $B^{n-1} = 0$. Backward
recursion based on (28'') shows that then $B^k = 0$ for all k; in particular B = 0, which contradicts (28).
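The induction step behind the commutator identity for powers of B is short. Assuming the identity for n, one splits the commutator with $B^{n+1}$ as follows:

```latex
\begin{align*}
AB^{n+1} - B^{n+1}A
  &= (AB^{n} - B^{n}A)\,B + B^{n}(AB - BA)\\
  &= i\,n\,B^{n-1}B + i\,B^{n}
   = i\,(n+1)\,B^{n}.
\end{align*}
```

The case n = 1 is the commutation relation itself.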

On the other hand, the operators A = i(d/dμ), B = μ, acting on the Hilbert
space $L^2(\mathbb{R})$ of functions f(μ) do satisfy (28), for

$$i\frac{d}{d\mu}(\mu f) - \mu\, i\frac{df}{d\mu} = i f.$$


It was shown by von Neumann that this pair is, except for multiplicity and unitary
equivalence, the only pair of operators satisfying the commutation relation. Before
stating precisely and proving this, we follow Weyl in reformulating (28). Consider

$$U(s)BU(-s), \tag{31}$$

where U(s) is the unitary group generated by iA. On the domain of A, U(s) satisfies

$$\frac{d}{ds}U(s) = iAU(s) = iU(s)A.$$

Differentiating (31) formally and using the commutation relation (28), we get

$$\frac{d}{ds}U(s)BU(-s) = iU(s)[AB - BA]U(-s) = -I.$$

Integrating this relation gives

$$U(s)BU(-s) = B - sI, \tag{32}$$


called the Weyl form of the commutation relation. It is taken to mean that for all real 


values of s the self-adjoint operators on the two sides of (32) are identical. 
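For the concrete pair A = i(d/dμ), B = μ the Weyl relation can be verified directly. The following is a formal sketch on smooth, rapidly decaying f, using the fact that exp(isA) = exp(−s d/dμ) acts as translation:

```latex
\begin{align*}
(U(s)f)(\mu) &= f(\mu - s),\\
\bigl(U(s)\,B\,U(-s)f\bigr)(\mu)
  &= \bigl(B\,U(-s)f\bigr)(\mu - s)
   = (\mu - s)\,\bigl(U(-s)f\bigr)(\mu - s)
   = (\mu - s)\,f(\mu)
   = \bigl((B - sI)f\bigr)(\mu).
\end{align*}
```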


Exercise 4. Deduce from (32) that U(s) maps the domain of B onto itself. 


Exercise 5. Denote by V(t) the unitary group generated by iB. Deduce from (32)
that for all real s and t,

$$U(s)V(t) = e^{-ist}\,V(t)U(s).$$

(Hint: Differentiate with respect to t.)

The next result is due to von Neumann:

Theorem 11. Let A and B be a pair of self-adjoint operators acting in a Hilbert space
H, U(t) the unitary group of operators generated by iA. Suppose that the Weyl
relation (32) is satisfied. Then there is a representation of H as $L^2(N, \mathbb{R})$ so that

$$A = i\frac{d}{d\mu}, \qquad B = \mu.$$

Proof. We remarked earlier that Sinai derived the translation representation
theorem from von Neumann’s theorem. Since we have given an independent proof of the
translation representation theorem, we are entitled to reverse Sinai’s proof.

Let E(λ) be the spectral resolution for the self-adjoint operator B:

$$B = \int \lambda \, dE(\lambda). \tag{33}$$

Then

$$U(s)BU(-s) = \int \lambda \, d\big(U(s)E(\lambda)U(-s)\big) \tag{34}$$

and

$$B - sI = \int (\lambda - s)\, dE(\lambda) = \int \lambda \, dE(\lambda + s). \tag{34'}$$

The operators on the left of (34) and (34') are self-adjoint; the integrals on the right
give their spectral resolution. Since, according to (32), the two operators are
identical, so are their spectral resolutions: for every Borel set T,

$$U(s)E(T)U(-s) = E(T + s). \tag{35}$$

Denote by F the range of $E(\mathbb{R}_-)$. We claim that the group U(s) and the subspace F
behave as indicated in (15). Setting T = $\mathbb{R}_-$ in (35), we conclude that U(s)F is the
range of $E(\mathbb{R}_- + s)$. According to spectral theory, these form a one-parameter family
of subspaces increasing with s, going from {0} to H as s goes from −∞ to ∞. These
are the properties of F stated in (15a) to (15c).

We appeal now to theorem 7, according to which H has a representation as
$L^2(N, \mathbb{R})$ in which U(s) acts as translation, and $F = L^2(N, \mathbb{R}_-)$. This shows that
the generator iA of U is represented as −d/dμ. It further follows that the range
of $E(\mathbb{R}_- + s)$ is $U(s)F = L^2(N, \mathbb{R}_- + s)$. Thus $E(\mathbb{R}_- + s)$ is multiplication
by the characteristic function of $\mathbb{R}_- + s$. Setting this into (33), we see that B is
multiplication by μ. □


PHILOSOPHICAL-HISTORICAL NOTE. The uncertainty principle is one of those
notions of mathematical physics that have profoundly changed philosophical thinking.
Other examples are quantum jump, the special theory of relativity, Gödel’s
incompleteness theorem, and maybe black holes. It has even entered public consciousness.
An example occurs in Michael Frayn’s play Copenhagen, triumphantly presented in
London and on Broadway. The play revolves around a visit that Heisenberg paid
to Bohr in Copenhagen on September 21, 1941, the high water mark of Germany’s
conquests. Heisenberg claimed that he came with a vague proposal that scientists on
neither side should make an effort to build a nuclear bomb. Bohr’s recollection was
that Heisenberg came to gather information; he denied that Heisenberg’s account
had “any basis in the actual events.” The playwright suggests that perceiving the
same event differently is a manifestation of a kind of uncertainty principle in human
communication.

Arnold Kramish, physicist and historian of the nuclear age, has evidence that the
visit to Copenhagen was an intelligence gathering mission, no uncertainty about it. It
was triggered by an article in 1941 in the Swedish newspaper Stockholms-Tidningen
describing an effort in the United States to build a new type of bomb out of uranium,
with unprecedented explosive power. This article was picked up by Dr. P. K. Schmidt,
head of the Press Branch of the German Foreign Office. Schmidt forwarded the report
to the physicist Carl von Weizsäcker, son of the German Foreign Minister Ernst
von Weizsäcker, who on September 4, 1941, informed the Abwehr, the Intelligence
Branch of the German High Command, and Bernhard Rust, Reichsminister in charge
of the ongoing German uranium project, of which Heisenberg and Weizsäcker were
leading members. A “cultural” visit of Heisenberg and Weizsäcker to Copenhagen
two weeks later was arranged at the highest level.

It is ironical that the report in the Swedish newspaper was premature; the American
uranium project did not start until 1942. By an even greater irony of fate, Heisenberg
learned nothing on this trip from Bohr, but Bohr found out that the Germans had
an active uranium project. In 1943, when Bohr escaped to America, he warned the
leaders of the Manhattan Project of the danger of a possible German nuclear bomb.


BIBLIOGRAPHY


Hopf, E. Ergodentheorie. Ergebnisse der Math., 2, Springer, Berlin, 1937.

Heisenberg, W. Z. Phys., 43 (1927).

Koopman, B. O. Hamiltonian systems and transformations in Hilbert space. Proc. Nat. Acad. Sci. USA, 17 (1931): 315-318.

Lax, P. D. The ergodic character of sequences of pedal triangles. Am. Math. Monthly, 97 (1990): 377-381.

Lax, P. D. and Phillips, R. S. Scattering Theory. Pure and Applied Mathematics, 26, Academic Press, New York, 1967.

Lax, P. D. and Phillips, R. S. The Translation Representation Theorem. Integral Equations and Operator Theory 4. Birkhäuser, Boston. 1981, pp. 416-421.

von Neumann, J. Die Eindeutigkeit der Schrödingerschen Operatoren. Math. An., 104 (1931).

von Neumann, J. Proof of the quasi-ergodic hypothesis. Proc. Nat. Acad. Sci. USA, 18 (1932): 70-82.

Schwartz, J. The pernicious influence of mathematics on science. Logic, Methodology and Philosophy of Science. Proc. 1960 Int. Congr. E. Nagel, P. Suppes, and Tarski, eds. Stanford University Press, Stanford, 1962, pp. 356-360.

Sinai, Ja. G. Dynamical systems with countable Lebesgue spectrum. Izv. Akad. Nauk SSSR, 25 (1961): 899-924.

Stone, M. Linear transformations in Hilbert space, IV. Proc. Nat. Acad. Sci. USA, 15 (1929): 198-200.

Weyl, H. Quantenmechanik und Gruppentheorie. Z. Phys., 46 (1927): 1-46.

Wielandt, H. Über die Unbeschränktheit der Operatoren der Quantenmechanik. Math. An., 121 (1949): 21.


EXAMPLES OF STRONGLY 
CONTINUOUS SEMIGROUPS 


36.1 SEMIGROUPS DEFINED BY PARABOLIC EQUATIONS 
In chapter 16, section 16.5, we studied solutions of the heat equation

$$u_t = u_{xx} \tag{1}$$

that are defined for all t > 0 and all x, and that decay sufficiently rapidly as |x| → ∞.
We have shown there, see theorem 13, that such solutions are uniquely determined
for all t by their initial values. The solution operators S(t) relate initial values of
solutions to their values at time t:

$$S(t)u(0) = u(t).$$

The results of section 16.5 can be summarized as follows in semigroup language:
The solution operators S(t) form a strongly continuous semigroup of contraction
operators in any of the $L^p(\mathbb{R})$ spaces, $1 \le p < \infty$, and on the space of continuous
functions on R that are 0 at ±∞.
It was remarked already in chapter 16 that similar results hold for a far more
general class of equations than the heat equation: we may replace in (1) the second
space derivative by any second-order elliptic operator $E$, in any number of space
variables, acting on $u$:

$$u_t = Eu, \tag{2}$$

$$E = \sum a_{ij}\,\partial_i \partial_j + \sum b_i\,\partial_i + c, \qquad \partial_j = \frac{\partial}{\partial x_j}, \tag{2'}$$

where $(a_{ij})$ is a real, uniformly positive definite symmetric matrix whose entries are
smooth functions of $x$. The coefficients $b_i$ and $c$ are also smooth functions of $x$. The
whole space may be replaced by a bounded domain in $\mathbb{R}^n$, on whose boundary $u$
is required to satisfy a single boundary condition, say $u = 0$. The solution operators
again form a semigroup. The domain of the infinitesimal generator includes
all smooth functions that satisfy the boundary condition, and on such functions the
generator acts as the operator $E$ defined in (2'). A possible approach to proving the
existence of solutions of (2) with prescribed initial values is to make the proper
extension of the operator $E$ and then verify that $E$ thus extended satisfies the hypotheses
of the Hille-Yosida theorem.


36.2 SEMIGROUPS DEFINED BY ELLIPTIC EQUATIONS 


In the first example we take the Banach space $C(S^m)$, the space of continuous functions
$u$ on the $m$-dimensional unit sphere. For each such function $u$ there is a uniquely
determined harmonic function $h = h(r\omega)$ in the $(m+1)$-dimensional unit ball which
equals $u$ on the boundary of the unit ball:

$$\Delta h = 0, \qquad h(\omega) = u(\omega), \quad \omega \text{ in } S^m. \tag{3}$$

We define the semigroup $Z(t)$ as follows:

$$Z(t)u = h(e^{-t}\omega), \tag{4}$$

where $h$ is the harmonic function defined by (3).
Theorem 1. 


(i) The operators Z(t) are contractions in the maximum norm. 
(ii) Z(t) form a strongly continuous one-parameter semigroup of operators.
(iii) For t > 0 the operator Z(t) is compact. 


Proof. (i) follows from the maximum principle for harmonic functions. (ii) The
semigroup property is clear: if $h(x)$ is a harmonic function, so is $h(cx)$ for any
constant $c$; therefore $Z(t)Z(s)u = h(e^{-t}e^{-s}\omega) = h(e^{-(t+s)}\omega) = Z(s+t)u$. Strong
continuity is a consequence of the continuity of harmonic functions in the unit ball
whose boundary values on the unit sphere are continuous.

(iii) To prove the compactness of the operator $Z(t)$, $t > 0$, we have to show that
the image of the unit ball under $Z(t)$ lies in a compact set. This image consists of
the restrictions to the sphere $|x| = e^{-t}$ of harmonic functions that are bounded by 1 in the unit
ball. According to the Arzela-Ascoli criterion (see the beginning of chapter 22), for
precompactness in the maximum norm a set of functions has to be equicontinuous. It
is a well-known property of harmonic functions that on any compact subset of their
domain of definition, their first (or any higher-order) derivatives are bounded by a
constant multiple of their maximum on their domain of definition. Since uniformly
bounded first derivatives guarantee equicontinuity, it follows that the image of the
unit ball in $C(S^m)$ under $Z(t)$, $t > 0$, is precompact. □


In our second example we replace the maximum norm by the $L^2(S^m)$ norm. The
same results hold, as well as an additional property:


Theorem 1'.


(i) The operators Z(t) defined by (3) and (4) are contractions in the $L^2(S^m)$ norm.
(ii) Z(t) form a strongly continuous semigroup in the $L^2(S^m)$ norm.
(iii) Z(t) is a compact operator for t > 0.
(iv) The operators Z(t) are real symmetric.


Proof. (i) We state and prove an $L^2$ analogue of the maximum principle. For simplicity
let us take the case $m = 1$. Every harmonic function $h$ defined in the unit disk
can be expanded into a Fourier series:

$$h = \sum r^n (a_n \cos n\theta + b_n \sin n\theta),$$

where $r$ and $\theta$ are polar coordinates. By the Parseval relation

$$\int h^2(r, \theta)\, d\theta = \pi \sum r^{2n} (a_n^2 + b_n^2), \tag{5}$$

clearly an increasing function of $r$.

This completes the proof of (i). To deduce (ii), we go back to theorem 1, that $Z(t)u$
is continuous in the max norm for $u$ continuous; it follows that $Z(t)u$ is continuous
in the $L^2$ norm as well. Since the continuous functions are a dense subspace of $L^2$,
it follows that $Z(t)u$ is continuous in the $L^2$ norm for all $u$ in $L^2$.

(iii) The proof of compactness of $Z(t)$ for $t > 0$ can be deduced as before, since
it is possible to estimate a harmonic function and its derivatives at points of the unit
ball with $r < 1$ in terms of $r$ and the square integral of the harmonic function on the
unit sphere.

(iv) The symmetry of $Z$ is expressed by

$$(Zu, v) = (u, Zv).$$

In view of the definition (4) of $Z$, this means that for any pair of harmonic functions
$h$ and $k$ in the unit ball, and any $s < 1$,

$$\int h(s, \omega)\, k(1, \omega)\, d\omega = \int h(1, \omega)\, k(s, \omega)\, d\omega. \tag{6}$$

To prove this, we deform the left side of (6) continuously into the right side
through the one-parameter family

$$\int h(\rho, \omega)\, k(q, \omega)\, d\omega, \qquad q = \frac{s}{\rho}, \quad s \le \rho \le 1. \tag{7}$$

At $\rho = s$ the integral (7) is the left side of (6); at $\rho = 1$ it is the right side.
In the open interval $s < \rho < 1$ the integral in (7) depends differentiably on $\rho$; its
derivative with respect to $\rho$ is

$$\int \left[ h_r(\rho)\, k(q) - h(\rho)\, k_r(q)\, \frac{q}{\rho} \right] d\omega, \tag{8}$$

where we have used the fact that $dq/d\rho = -q/\rho$. Now define the function $\ell$ by

$$\ell(r, \omega) = k\!\left( \frac{q}{\rho}\, r, \omega \right).$$

$\ell$ is a harmonic function in the ball of radius $\rho/q$, and

$$\ell(\rho, \omega) = k(q, \omega), \qquad \ell_r(\rho, \omega) = \frac{q}{\rho}\, k_r(q, \omega).$$

Setting this into (8), we obtain

$$\int \left[ h_r(\rho)\, \ell(\rho) - h(\rho)\, \ell_r(\rho) \right] d\omega. \tag{9}$$

We recall now Green's formula:

$$\int_G \left[ \ell\, \Delta h - h\, \Delta \ell \right] dx = \int_{\partial G} \left[ \ell\, h_n - h\, \ell_n \right] dS. \tag{10}$$

Since $h$ and $\ell$ are harmonic functions, the left side of (10) is zero; therefore so is the
right side. Now take $G$ to be the ball of radius $\rho$; the normal derivatives $h_n$ and $\ell_n$
are then derivatives with respect to the radius $r$, and $dS = \rho^m\, d\omega$. The right side of
(10) is $\rho^m$ times (9). This proves that (9) is zero; therefore (7) is independent of $\rho$.
Letting $\rho$ tend to $s$ and to 1, we deduce that the two sides of (6) are equal. □
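
The case $m = 1$ lends itself to a small numerical illustration. The sketch below is not from the text: it assumes that $u$ is a trigonometric polynomial stored as a list of coefficient pairs $(a_n, b_n)$, so that by (4) and the Fourier expansion of $h$ the operator $Z(t)$ multiplies the $n$th pair by $e^{-nt}$. The names `Z` and `inner` and the sample data are hypothetical. The scalar product uses the Parseval relation, with the $n = 0$ term weighted by $2\pi$.

```python
import math

def Z(t, coeffs):
    """Apply Z(t) to u given by Fourier coefficients [(a_0, b_0), (a_1, b_1), ...].

    The harmonic extension is h = sum r^n (a_n cos n*theta + b_n sin n*theta),
    so evaluating at r = e^{-t} multiplies the n-th pair by e^{-n t}.
    """
    return [(math.exp(-n * t) * a, math.exp(-n * t) * b)
            for n, (a, b) in enumerate(coeffs)]

def inner(c1, c2):
    """L^2 scalar product on the circle via Parseval (b_0 plays no role)."""
    total = 2 * math.pi * c1[0][0] * c2[0][0]
    for (a1, b1), (a2, b2) in zip(c1[1:], c2[1:]):
        total += math.pi * (a1 * a2 + b1 * b2)
    return total

# Hypothetical sample data: two trigonometric polynomials of degree 3.
u = [(0.5, 0.0), (1.0, -2.0), (0.0, 0.3), (0.7, 0.0)]
v = [(-1.0, 0.0), (0.2, 0.4), (1.5, 0.0), (0.0, 2.0)]
t, s = 0.3, 0.5

print("semigroup :", Z(t, Z(s, u))[1], Z(t + s, u)[1])
print("contraction:", inner(Z(t, u), Z(t, u)), "<=", inner(u, u))
print("symmetry  :", inner(Z(t, u), v), inner(u, Z(t, v)))
```

The three printed lines correspond to parts (ii), (i), and (iv) of theorem 1'. A single mode such as $\cos 3\theta$ is simply multiplied by $e^{-3t}$.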


We show next that the generator G of Z(t) defined in theorem 1’ is self-adjoint. 
To see this, let u and v belong to the domain of G; since Z(t) is symmetric, 


(Z(t)u, v) = (u, Z(t)v). 
Differentiate with respect to t, and set t = 0:
(Gu, v) = (u, Gv), 


proving that G is symmetric. We saw in chapter 32 that every complex number with 
sufficiently large real part belongs to the resolvent set of G. On the other hand, we 
have shown in chapter 33 that every unbounded symmetric operator whose resolvent 
includes points in both the upper and lower half of the complex plane is self-adjoint. 
Combining these two facts, we conclude that G is self-adjoint. 


Exercise 1. Show that if the generator G of a semigroup Z(t) is self-adjoint, so is 
Z(t). (Hint: Use the functional calculus of self-adjoint operators.) 


What is the spectrum of $G$? Since $Z(t)$ is compact for $t > 0$, its spectrum is pure
point spectrum, a discrete set of points accumulating only at zero. It follows from
Phillips' spectral mapping theorem (see chapter 34, section 34.5) that the spectrum
of $G$ also is discrete and accumulates only at $-\infty$. It turns out that the spectrum of
$G$ can be determined explicitly.

Let $\gamma$ denote a point of the spectrum of $G$; since $G$ is self-adjoint, $\gamma$ is real.
By the aforementioned spectral mapping theorem, $e^{\gamma t}$ belongs to the spectrum of
$Z(t)$; since $Z(t)$ is a contraction, $\gamma \le 0$. Let $e(\omega)$ be a corresponding eigenfunction,
$h(r\omega)$ the harmonic function whose boundary value is $e(\omega)$. By definition (4),
$Z(t)e = h(e^{-t}\omega) = e^{\gamma t} e(\omega)$. Setting $e^{-t} = r$, we get that

$$h(r\omega) = r^{-\gamma} e(\omega). \tag{11}$$

It follows from (11) that $\gamma$ is a nonpositive integer; otherwise, the function $r^{-\gamma} e(\omega)$
would have only a finite number of derivatives at $r = 0$, contrary to the fact that
harmonic functions in the unit ball are infinitely differentiable (even analytic) at the
origin.


Theorem 2. The spectrum of the generator G of the semigroup Z(t), defined in theorem 1',
consists of all integers $\le 0$.


Proof. What remains to show is that for every nonnegative integer $n$, $-n$ belongs to the
spectrum of $G$. Consider the space $P$ of homogeneous polynomials $p(x_0, \ldots, x_m)$
of degree $n$; denote its dimension by $d_n$. The Laplace operator maps such a $p$ into
the space of homogeneous polynomials of degree $n - 2$. By linear algebra, a subspace
of $P$ of dimension at least $d_n - d_{n-2}$ is mapped into zero; that subspace consists
of harmonic polynomials. The boundary value $e(\omega) = p(\omega)$ of such a polynomial
satisfies

$$Z(t)e = p(e^{-t}\omega) = e^{-nt} e(\omega),$$

and so is an eigenfunction of $Z(t)$. Differentiation with respect to $t$ shows that $e$ is
an eigenfunction of $G$ with eigenvalue $-n$. □
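
The dimension count in the proof is easy to tabulate. The sketch below assumes the standard formula $d_n = \binom{n+m}{m}$ for the number of monomials of degree $n$ in $m + 1$ variables, which the text does not state; the function names are hypothetical. It prints the lower bound $d_n - d_{n-2}$ for the circle ($m = 1$) and for the two-sphere ($m = 2$).

```python
from math import comb

def dim_homogeneous(n, m):
    """Dimension d_n of homogeneous polynomials of degree n in m + 1 variables."""
    return comb(n + m, m) if n >= 0 else 0

def harmonic_lower_bound(n, m):
    """d_n - d_{n-2}: lower bound on the dimension of harmonic polynomials of degree n."""
    return dim_homogeneous(n, m) - dim_homogeneous(n - 2, m)

for n in range(5):
    print(n, harmonic_lower_bound(n, 1), harmonic_lower_bound(n, 2))
```

The bound is positive for every $n$, which is all the proof needs. For $m = 1$ and $n \ge 1$ it equals 2, matching the pair $r^n \cos n\theta$, $r^n \sin n\theta$; for $m = 2$ it equals $2n + 1$, the familiar count of spherical harmonics of degree $n$.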


36.3 EXPONENTIAL DECAY OF SEMIGROUPS 


In chapter 34, section 34.1, we saw that a strongly continuous one-parameter semigroup
grows at most exponentially as $t \to \infty$. In this section we investigate how fast
such a semigroup can decay.

Suppose that the infinitesimal generator $G$, acting on a Hilbert space $H$, is self-adjoint
and bounded from above. Using the spectral resolution of $G$ we can express
the semigroup as

$$Z(t) f = \int e^{\lambda t}\, dE(\lambda) f.$$


It follows from this formula that for $f \ne 0$, $Z(t)f$ decreases at some exponential rate
as $t$ tends to $\infty$. The main result of this section (see Lax) includes a perturbation of
this result.


Theorem 3. Let H be a Hilbert space, G the generator of a strongly continuous
semigroup. Assume that there is an infinite sequence $\{\xi_n\}$ of real numbers, tending to
$-\infty$, such that the resolvent of G is uniformly bounded by some constant $d^{-1}$ on all
lines $\operatorname{Re} \lambda = \xi_n$ in the complex plane:

$$\|(G - \lambda)^{-1}\| \le d^{-1}, \qquad \operatorname{Re} \lambda = \xi_n. \tag{12}$$

Let $u(t)$, $0 \le t$, be a vector-valued function whose values lie in the domain of G.
Assume that $Gu(t)$ is a strongly continuous function of t, and that u is strongly
differentiable, and its derivative $u_t$ is strongly continuous. We require $u(t)$ to satisfy
for all t the inequality

$$\|u_t - Gu\| \le k\, \|u\|, \qquad 0 \le t, \tag{13}$$

where k is some constant less than d. Then, unless identically zero, $u(t)$ decays no
faster than exponentially in the $L^2$ sense; that is, there is a positive number b such
that

$$\int_0^\infty \|u(t)\|^2 e^{bt}\, dt = \infty. \tag{14}$$


Proof. The proof is based on an inequality contained in


Lemma 4. Let G be the generator of a strongly continuous semigroup, whose resolvent
is bounded by $d^{-1}$ for $\operatorname{Re} \lambda = \xi$. Let $u(t)$, $-\infty < t < \infty$, be a vector-valued
function whose values lie in the domain of G. We assume that $Gu(t)$ is a strongly
continuous function of t, and that $u(t)$ has a strongly continuous first derivative with
respect to t. Then

$$d^2 \int_{\mathbb{R}} \|u(t)\|^2 e^{-2\xi t}\, dt \le \int_{\mathbb{R}} \|Gu - u_t\|^2 e^{-2\xi t}\, dt, \tag{15}$$

provided that the integral on the left is finite.

Proof. Define $v(t) = e^{-\xi t} u(t)$; then $(Gu - u_t)e^{-\xi t} = (G - \xi)v - v_t$. Its Fourier
transform is $(G - \xi - i\tau)\hat{v}(\tau)$, where $\hat{v}$ is the Fourier transform of $v$. Since, by
assumption, $\|(G - \lambda)^{-1}\| \le d^{-1}$ for $\operatorname{Re} \lambda = \xi$,

$$d^2 \|\hat{v}(\tau)\|^2 \le \|(G - \xi - i\tau)\hat{v}(\tau)\|^2. \tag{16}$$

According to Parseval's theorem, the Fourier transform preserves the $L^2$-norm. Integrating
(16) we obtain (15), provided that $v$ is in $L^2$. □




Exercise 2. Prove Parseval’s theorem for functions whose values lie in a Hilbert 
space. 


We turn now to theorem 3. First we extend $u(t)$ for negative values of $t$, such as
by setting $u(t) = u(0)a(t)$ for $t < 0$, where $a(t)$ is a smooth function of compact
support, $a(0) = 1$. We apply inequality (15) to $u$ thus extended, with $\xi = \xi_n$ where
(12) is satisfied. On the right, we apply for $t > 0$ inequality (13), and on the left, we
drop the integral over $\mathbb{R}_-$. We get

$$d^2 \int_{\mathbb{R}_+} \|u(t)\|^2 e^{-2\xi_n t}\, dt \le K_n + k^2 \int_{\mathbb{R}_+} \|u(t)\|^2 e^{-2\xi_n t}\, dt, \tag{17}$$

where $K_n$ is the integral of $\|Gu - u_t\|^2 e^{-2\xi_n t}$ over $\mathbb{R}_-$, and therefore tends to zero
as $n$ tends to $\infty$. Since $k$ is less than $d$, we deduce that

$$\int_{\mathbb{R}_+} \|u(t)\|^2 e^{-2\xi_n t}\, dt \le \frac{K_n}{d^2 - k^2}, \tag{17'}$$

provided that the integral on the left is finite. If, contrary to (14), the left side is finite
for all $\xi_n$, it follows from (17') that $u(t) = 0$ for all $t > 0$. □


Note that in the proof we only required the resolvent of $G - \lambda$ to be bounded as
in (12), and not that it be defined everywhere.

Note that for $G$ self-adjoint, condition (12) is equivalent to this:

The intervals $(\xi_n - d, \xi_n + d)$ are free of the spectrum of $G$.

So as a corollary of theorem 3 we get 


Theorem 3’. Let G be a self-adjoint operator, bounded from above, whose spectrum 
has infinitely many gaps of width 2d. Let u(t) be a vector-valued function as in 
theorem 3, satisfying inequality (13); then, unless identically zero, u(t) decays no 
faster than exponentially, in the sense of (14). 
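
A diagonal model shows what the theorem asserts in the simplest unperturbed case. The sketch below is an illustration under assumptions not taken from the text: $G$ acts on an orthonormal basis by $Ge_n = -ne_n$, so the gaps are centered at the half-integers with $d = 1/2$, and $u(t) = \sum c_n e^{-nt} e_n$ solves $u_t = Gu$ exactly, so $k = 0$. The coefficient values and function names are hypothetical.

```python
import math

def norm_u(t, coeffs):
    """Norm of u(t) = sum_n c_n e^{-n t} e_n, where G e_n = -n e_n."""
    return math.sqrt(sum((c * math.exp(-n * t)) ** 2 for n, c in coeffs.items()))

def decay_rate(t, coeffs):
    """Observed exponential rate -log ||u(t)|| / t."""
    return -math.log(norm_u(t, coeffs)) / t

# Hypothetical data: only the modes n = 2 and n = 5 are excited.
coeffs = {2: 1.0, 5: 3.0}
for t in (1.0, 5.0, 30.0):
    print(t, decay_rate(t, coeffs))
```

The observed rate approaches 2, the index of the slowest excited mode, so the integral in (14) diverges for $b \ge 4$. A nonzero solution cannot decay faster than every exponential because some mode must be excited.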


The restriction $k < d$ is sharp; a counterexample for the case where it is violated is given in
Lax.

Here is an application of theorem 3'. Let $L$ be a partial differential operator of the
form $L = \Delta - c$, where $\Delta$ is the Laplace operator and $c$ is the operator of multiplying
by the function $c(x)$, assumed to be smooth and positive. The $L^2$ analogue of the
maximum principle (5) holds for solutions of $Lw = 0$:


Lemma 5. Let w be a solution of $Lw = 0$, $L = \Delta - c$, c positive. Then

$$I(r) = \int w^2(r\omega)\, d\omega \tag{18}$$

is an increasing function of r.




Proof. Differentiate (18) with respect to $r$; we get

$$\frac{d}{dr} I(r) = 2 \int w\, w_r\, d\omega. \tag{18'}$$

Next we use Green's theorem to transform the following integral over the ball $B_r$ of
radius $r$:

$$0 = \int_{B_r} w\, Lw\, dx = \int_{B_r} (w\, \Delta w - c\, w^2)\, dx = \int_{\partial B_r} w\, w_r\, dS - \int_{B_r} (|\nabla w|^2 + c\, w^2)\, dx,$$

where $dS = r^m\, d\omega$. Since $c$ is a positive function, the positivity of (18') follows, and so does the lemma. □

We designate by $H$ the Hilbert space $L^2(S)$, $S$ the unit sphere. We take $G$ to be
the generator of the semigroup $Z(t)$ discussed in theorem 1' of section 36.2. Recall
that $Z(t)$ is defined by equation (4):

$$Z(t)u = h(e^{-t}\omega), \tag{4}$$

where $h$ is the harmonic function defined in (3):

$$\Delta h = 0, \qquad h(\omega) = u(\omega), \quad \omega \text{ in } S. \tag{3}$$

Differentiating (4) at $t = 0$ gives

$$Gu = -h_r. \tag{19}$$

Let $w$ be a solution of $Lw = 0$ in an open set containing the unit ball. Define $u(t)$
to be $w(e^{-t}\omega)$. Then at $t = 0$

$$u_t = -w_r. \tag{20}$$

We now show that this function $u$ satisfies inequality (13) of theorem 3, with $k = k(t)$
a function that tends to zero as $t$ tends to $\infty$.

Let $v$ be any smooth function on $S$; define $p(x)$ to be the harmonic function in the
unit ball whose value on $S$ is $v$:

$$\Delta p = 0, \qquad p(\omega) = v(\omega). \tag{21}$$

Multiply (21) by $h$ and integrate over the unit ball. Since $h$ and $p$ are both harmonic
functions, Green's theorem applied on the unit ball $B$ gives

$$0 = \int_S (p\, h_r - p_r\, h)\, d\omega. \tag{22}$$

A similar application of Green's theorem gives

$$\int_B \left[ p\, \Delta w - (\Delta p)\, w \right] dx = \int_S (p\, w_r - p_r\, w)\, d\omega.$$

Using the fact that $\Delta p = 0$ and $\Delta w = cw$, we get

$$\int_B p\, c\, w\, dx = \int_S (p\, w_r - p_r\, w)\, d\omega.$$

Subtracting (22) from this gives

$$\int_S \left[ p\,(w_r - h_r) + p_r\,(h - w) \right] d\omega = \int_B p\, c\, w\, dx. \tag{23}$$

At $t = 0$, $u(\omega) = w(\omega)$, and by (3), $h(\omega) = u(\omega) = w(\omega)$. According to (21), $p(\omega) = v(\omega)$.
Furthermore, according to (19), $h_r = -Gu$, and by (20), $w_r = -u_t$. Setting these
into (23) gives

$$(v, Gu - u_t) = \int_B p\, c\, w\, dx, \tag{23'}$$
B 


where $(\ ,\ )$ is the scalar product in $H$. By the Schwarz inequality the right side of (23')
is less than

$$c_{\max} \left( \int_B p^2\, dx \right)^{1/2} \left( \int_B w^2\, dx \right)^{1/2}, \tag{24}$$

where $c_{\max}$ denotes the maximum of $c$ over $B$. It was shown in section 36.2 that for a harmonic function $p$, $\int p^2(r\omega)\, d\omega$ is an
increasing function of $r$. It follows that

$$\int_B p^2\, dx = \int_0^1 \!\! \int p^2(r\omega)\, d\omega\; r^m\, dr \le \int_0^1 r^m\, dr \int p^2(\omega)\, d\omega = \frac{1}{m+1} \|v\|^2,$$

where $\|\cdot\|$ is the norm in $H$. Similarly, using lemma 5, we can estimate

$$\int_B w^2\, dx \le \frac{1}{m+1} \|u\|^2.$$

These estimates show that (24) is bounded by $\frac{c_{\max}}{m+1} \|v\|\, \|u\|$. So we deduce
from (23') that

$$|(v, Gu - u_t)| \le \frac{c_{\max}}{m+1}\, \|u\|\, \|v\|. \tag{25}$$

According to corollary 1' of chapter 6, the norm of every element $f$ of the real Hilbert
space $H$ can be characterized as follows:

$$\|f\| = \sup\, (v, f),$$

where $v$ ranges over a set of unit vectors, $\|v\| = 1$, that are dense in the unit sphere
of $H$. So we conclude from (25) that for $t = 0$,

$$\|Gu - u_t\| \le \frac{c_{\max}}{m+1}\, \|u\|. \tag{26}$$

This is inequality (13), with $k = c_{\max}/(m+1)$.
Recall that $u(t)$ in $H$ is defined as the value on the unit sphere of $w(e^{-t}x)$. The
function $w(t) = w(e^{-t}x)$ satisfies the differential equation

$$\Delta w(t) = e^{-2t} (\Delta w)(e^{-t}x) = e^{-2t} c(e^{-t}x)\, w(t).$$

Therefore, applying (26) to $u(t)$, we get

$$\|Gu - u_t\| \le \frac{e^{-2t} c_{\max}}{m+1}\, \|u\|. \tag{26'}$$

This is inequality (13), with $k(t) = e^{-2t} c_{\max}/(m+1)$.
According to theorem 2 the spectrum of $G$ consists of the nonpositive integers; therefore
it contains infinitely many gaps of length 1. The factor on the right in inequality (26')
tends to zero as $t$ tends to $\infty$. Therefore we can apply theorem 3' to $u(t) = w(e^{-t}\omega)$
and conclude, as in (14), that unless $w \equiv 0$, there is a constant $b$ such that

$$\int_0^\infty \|u(t)\|^2 e^{bt}\, dt = \infty. \tag{14}$$

Denote $e^{-t} = r$, and $x = r\omega$. Since $dt = -dr/r$, and $dx = r^m\, d\omega\, dr$, we can
rewrite (14) as follows:

$$\int_{|x| < 1} |w(x)|^2\, |x|^{-(b + m + 1)}\, dx = \infty. \tag{14'}$$

We can restate our conclusion (14') as follows:


Theorem 6. A solution w of a partial differential equation $\Delta w + cw = 0$ cannot
have a zero of infinite order unless $w \equiv 0$.


NOTE. It is well known (to specialists in partial differential equations) that solutions 
of linear elliptic equations with analytic coefficients are themselves analytic. Such 
solutions cannot have zeros of infinite order. So the result derived above is novel 
only in the case when the function c(x) is not analytic. The first such theorem was 
obtained by Carleman in two space variables, and by Müller for any m; a very general 
result is due to Calderon. 


36.4 THE LAX-PHILLIPS SEMIGROUP 


In chapter 35, section 35.5, we introduced the concept of a translation representation
of a group of unitary operators $U(t)$ acting on a Hilbert space $H$. A key role there was
played by a subspace $F$, which we will here call an incoming subspace and denote
as $F_-$. The incoming subspace, $F_-$, is assumed to have the properties enumerated in
equation (15) in chapter 35:

$$U(t) F_- \subset F_- \quad \text{for } t < 0, \tag{27a}$$

$$\bigcap_t U(t) F_- = \{0\}, \tag{27b}$$

$$\overline{\bigcup_t U(t) F_-} = H. \tag{27c}$$

We also require the existence of an outgoing subspace $F_+$, satisfying analogous conditions:

$$U(t) F_+ \subset F_+ \quad \text{for } t > 0, \tag{28a}$$

$$\bigcap_t U(t) F_+ = \{0\}, \tag{28b}$$

$$\overline{\bigcup_t U(t) F_+} = H. \tag{28c}$$

Furthermore we require $F_-$ and $F_+$ to be orthogonal to each other.


Theorem 7. Let U(t) be a strongly continuous one-parameter group of unitary operators
acting on a Hilbert space H, $F_-$ and $F_+$ a pair of incoming and outgoing
subspaces in the sense of properties (27) and (28), orthogonal to each other. Denote
by $P_-$ orthogonal projection onto the orthogonal complement of $F_-$, and by
$P_+$ orthogonal projection onto the complement of $F_+$. Denote by K the orthogonal
complement in H of $F_- \oplus F_+$. Then

$$Z(t) = P_+ U(t) P_-, \qquad t \ge 0, \tag{29}$$

is a strongly continuous semigroup of contractions on K that tends strongly to zero
as t tends to $\infty$.


Proof. Clearly, each $Z(t)$ is a contraction. To show that they map $K$ into $K$, we
have to demonstrate that for $k$ in $K$, $Z(t)k$ is orthogonal to both $F_+$ and $F_-$. Since $P_+$ is
projection onto the orthogonal complement of $F_+$, it follows from (29) that the range
of $Z(t)$ is orthogonal to $F_+$.

Since $P_-$ is the identity on $K$, $Z(t)k = P_+U(t)k$. We claim that for $t > 0$, $U(t)k$
is orthogonal to $F_-$. To see this, take any $f_-$ in $F_-$ and write

$$(U(t)k, f_-) = (k, U^*(t) f_-) = (k, U(-t) f_-). \tag{30}$$

According to (27a), for $t > 0$, $U(-t)$ maps $F_-$ into $F_-$, and therefore the scalar
product on the right in (30) is zero. Since $P_+U(t)k$ differs from $U(t)k$ by a vector in
$F_+$, assumed to be orthogonal to $F_-$, it follows that $P_+U(t)k$, too, is orthogonal to
$F_-$.

Next we show that $Z(t)$ form a semigroup. According to properties (28), for $t > 0$,
$U(t)$ maps $F_+$ into $F_+$. Since $P_+$ removes the $F_+$ component, it follows that

$$P_+U(t)P_+ = P_+U(t), \qquad t > 0.$$

Thus we can write for $k$ in $K$ that

$$Z(t)Z(s)k = P_+U(t)P_+U(s)k = P_+U(t)U(s)k = P_+U(t+s)k = Z(t+s)k.$$

To show that $Z(t)$ tends strongly to zero, we use property (28c) that the union of
$U(t)F_+$ is dense in $H$. Therefore given any $k$ in $K$, and any $\epsilon > 0$, there is a vector
$f_+$ in $F_+$, and $r$ such that $U(r)f_+$ differs by less than $\epsilon$ from $k$. Since $P_+U(t)$ is a
contraction,

$$\|P_+U(t)(k - U(r)f_+)\| < \epsilon.$$

But by (28a), for $t + r > 0$, $P_+U(t+r)f_+$ is zero; therefore for $t > -r$, $\|P_+U(t)k\| =
\|Z(t)k\| < \epsilon$. □
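
The construction can be tried out on a discrete-time toy model. Everything in the sketch below is an assumption made for illustration and is not taken from the text: $H$ is the space of finitely supported sequences on the integers, stored as dictionaries; $U(n)$ is the right shift by $n$ steps; $F_-$ consists of sequences supported on $j < 0$; and $F_+$ of sequences supported on $j \ge A$, with a hypothetical width $A = 4$. Then $K$ consists of sequences supported on $0 \le j < A$.

```python
def shift(f, n):
    """U(n): right shift of a finitely supported sequence {index: value} by n steps."""
    return {j + n: x for j, x in f.items()}

def restrict(f, keep):
    """Orthogonal projection that keeps only the entries whose index satisfies keep."""
    return {j: x for j, x in f.items() if keep(j)}

A = 4  # hypothetical width of the interaction region K = {0 <= j < A}

def P_minus(f):
    # Removes the incoming component (support on j < 0).
    return restrict(f, lambda j: j >= 0)

def P_plus(f):
    # Removes the outgoing component (support on j >= A).
    return restrict(f, lambda j: j < A)

def Z(n, f):
    """Discrete analogue of (29): Z(n) = P_+ U(n) P_-."""
    return P_plus(shift(P_minus(f), n))

def norm_sq(f):
    return sum(x * x for x in f.values())

k = {0: 1.0, 1: 2.0, 3: -1.0}   # a vector in K
print(Z(1, Z(2, k)), Z(3, k))   # semigroup property
print(norm_sq(Z(1, k)), "<=", norm_sq(k))
print(Z(A, k))                  # everything has left K after A steps
```

In this model $Z(n)$ is a truncated shift, so it vanishes for $n \ge A$. In general the decay to zero is only asymptotic, as in theorem 7.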


In the next section we will present an example that lends substance and interest to 
the abstract theory described in this chapter. An equally interesting example will be 
given in chapter 37, section 37.9. 


36.5 THE WAVE EQUATION IN THE EXTERIOR OF AN OBSTACLE 


Let $B$ be a smoothly bounded domain, the obstacle, in $\mathbb{R}^3$, contained in the ball
of radius $R$, $R$ some positive number. At the end of section 35.4 of the preceding
chapter we have looked at solutions of the wave equation

$$u_{tt} - \Delta u = 0,$$

defined for all $x$ in the exterior of $B$ and for all times, and which are zero on the
boundary of $B$. We have shown there that such solutions conserve energy, that is,

$$E = \frac{1}{2} \int (u_x^2 + u_t^2)\, dx \tag{31}$$

is independent of time, where the integration is over all points $x$ in the exterior of $B$.
The square root of energy is the energy norm; we denote by $H$ the completion in the
energy norm of all initial data $\{u(0), u_t(0)\}$ that are smooth, of compact support in
the exterior of $B$, and are zero on the boundary of $B$. We denote the energy norm of
the initial data $f = \{f_1, f_2\}$ as $\|f\|_E$.

As already remarked in section 35.4, using techniques in the theory of partial differential
equations one can show that there exist for all times, positive and negative,
solutions of the wave equation in the exterior of $B$ with prescribed smooth initial
data with finite energy that are zero on the boundary of $B$. Conservation of energy
shows that these solutions are uniquely determined by their initial data. So we may
speak of the operator $U(t)$ that maps initial data into data at time $t$:

$$U(t) : \{u(0), u_t(0)\} \to \{u(t), u_t(t)\}.$$


We can extend, by continuity, these operators $U(t)$ to all of $H$. The operators thus
extended form a strongly continuous one-parameter group of unitary operators.

We recall now from the end of section 35.5 the notion of an incoming solution
$u(x,t)$ of the wave equation in the exterior of an obstacle as one that is zero in the
backward cone:

$$u(x,t) = 0 \quad \text{for } |x| < -t + R,\ t < 0.$$

The closure in the energy norm of initial data of such solutions is defined as the
incoming subspace $F_-$ for our group of unitary operators. Properties (27a) and (27b)
are easily verified. Property (27c) lies much deeper. A full proof is given in Lax
and Phillips; here we will merely point out its close connection with local energy
decay. We shall show that for every bounded subset $G$ of the exterior of the obstacle,
and for every $f$ in $H$,

$$\lim_{t \to -\infty} \|U(t)f\|_{E,G} = 0, \tag{32}$$

where $\|\cdot\|_{E,G}$ denotes the local energy norm:

$$\|f\|_{E,G}^2 = \frac{1}{2} \int_G (f_{1x}^2 + f_2^2)\, dx.$$

Property (27c) implies that given any $\epsilon > 0$ there exist a number $T$ and $g$ in $F_-$ such
that

$$\|f - U(T)g\|_E < \epsilon.$$

By definition of $F_-$, for $T + t$ negative $U(T + t)g$ is zero in the ball $|x| < R - T - t$.
For $t$ large enough negative this ball contains $G$. Since $U(t)$ preserves energy,
$\|U(t)f - U(t+T)g\|_E = \|f - U(T)g\|_E < \epsilon$; it follows that for $t$ large enough
negative $\|U(t)f\|_{E,G} < \epsilon$. This proves (32).

The importance of local energy decay is that the inverse implication also holds.
In fact property (27c) follows from a weaker form of energy decay:

$$\liminf_{t \to -\infty} \|U(t)f\|_{E,G} = 0.$$

For a derivation of property (27c) from (32), and a proof of (32), we refer to chapter V
of Lax and Phillips. We remark that the key step in the derivation of (32) is to show
that the generator of the group $U(t)$ has no point spectrum.

In a similar fashion we define outgoing solutions $v$ as those that are zero in the
forward cone:

$$v(x,t) = 0 \quad \text{for } |x| < t + R,\ t > 0.$$

The closure in the energy norm of the initial data of outgoing solutions forms the
outgoing subspace $F_+$. As before, properties (28a) and (28b) are easily verified,
but property (28c) is hard work.


Lemma 8. $F_-$ and $F_+$ are orthogonal to each other.


Proof. We mimic the construction of incoming data as described at the end of
section 35.5, chapter 35: choose any number $T$ greater than $R$. It follows from Huygens's
principle that a solution $u(x,t)$ in free space $\mathbb{R}^3 \times \mathbb{R}$ whose data at time $T$ are
zero for $|x| > T - R$ is incoming for $t < 0$.
It follows from the conservation of energy that if $u$ and $v$ are two solutions of the
wave equation in free space for $0 \le t \le T$, then the energy scalar product

$$(\{u(t), u_t(t)\}, \{v(t), v_t(t)\})_E$$

of two solutions of the wave equation is independent of $t$. In particular

$$(\{u(0), u_t(0)\}, \{v(0), v_t(0)\})_E = (\{u(T), u_t(T)\}, \{v(T), v_t(T)\})_E. \tag{33}$$

For $u$ constructed as above, and $v$ any outgoing solution, the right side of (33) is
zero, since the functions $u(T)$, $u_t(T)$ are supported inside the ball of radius $T - R$,
whereas an outgoing solution $v$ is zero there.

The argument above makes plausible but doesn't quite prove lemma 8, since we
haven't shown that the incoming solutions $u$ constructed above with the aid of Huygens's
principle are dense among all incoming solutions. I assure the reader that a
straightforward proof of lemma 8 can be found in Lax and Phillips.


The use of Huygens's principle to show the orthogonality of incoming and outgoing
data is quite natural, for in two space dimensions, where Huygens's principle
fails, incoming and outgoing data are not orthogonal.

The semigroup $Z(t)$ is a natural tool for studying the interaction of waves with
the obstacle. Incoming waves have not yet interacted with the obstacle, and outgoing
waves never will. Their removal by the projections $P_-$ and $P_+$ clears the way for the
study of waves that interact all the time with the obstacle.

There is an intimate relation between the shape of the obstacle and the spectrum
of the operators $Z(t)$ and their infinitesimal generator $G$. We start with a preliminary
result:


Theorem 9. Let k be any positive number. The operator (kI — G)~!Z(2R) is com- 
pact. 


For proof we refer to chapter V of the Lax and Phillips book. A fairly immediate 
consequence of theorem 9 is 


Corollary 9'. G has a pure point spectrum accumulating only at $\infty$.


Exercise 3. Deduce corollary 9' from theorem 9. (Hint: Use the spectral theory of
semigroups described in chapter 34, section 34.5.)


The link between the geometry of the obstacle and the spectrum of $Z(t)$ is the
geometrical optics description of wave propagation. The relevant geometric property
is how long the obstacle can retain a ray, defined as a path consisting of straight line
segments, reflected at the boundary of the obstacle according to the classical laws
of reflection. Denote by $\ell(B)$ the supremum of the length of reflected rays that are
contained within the ball of radius $R$.


Theorem 10.


(i) If $\ell(B) < \infty$, then Z(t) is compact for t large enough.
(ii) If $\ell(B) = \infty$, $\|Z(t)\| = 1$ for all t.


The proof of this theorem is too technical to be included here; a few remarks
will have to suffice. Lax and Phillips have pointed out that part (i) would follow
from a generalized Huygens principle, which states, roughly, that in the exterior of
an obstacle the sharp part of signals propagates along rays, including rays reflected
from the obstacle. This generalized Huygens principle was proved by Melrose and
Taylor.

Combining (i) with local energy decay, we conclude that all eigenvalues of $Z(t)$
are of the form $e^{\gamma_j t}$, $0 > \operatorname{Re} \gamma_1 \ge \cdots \to -\infty$. It follows that $\|Z(t)\|$ decays exponentially
as $t \to \infty$. Lax, Morawetz, and Phillips have shown that for star-shaped
obstacles (for which $\ell(B) < 2R$), $\|Z(t)\| \le e^{-\alpha t}$, $\alpha > 0$. Their proof is based on
the nonstandard energy estimates of Morawetz, and does not rely on the generalized
Huygens principle.

Part (ii) is based on the notion that the wave equation has solutions whose energy
is contained in an arbitrarily preassigned neighborhood of any given reflected ray.
Such solutions were constructed by Jim Ralston.

When $\ell(B) = \infty$, the real parts of the eigenvalues of the generator $G$ do not tend
to $-\infty$. Interesting information about their location was obtained by Ikawa.




BIBLIOGRAPHY 


indépendentes. Arkiv. Math., 26B, 17 (1939): 1-9.


Ikawa, M. Decay of solutions of the wave equation in the exterior of two convex obstacles. Osaka J. Math. 
19 (1982): 459-509. 


Lax, P. D. A stability theorem for solutions of abstract differential equations, and its application to the
study of local behavior of solutions of elliptic equations. CPAM, 9 (1956): 747-766.


Lax, P. D. and Phillips, R. S. Scattering Theory, Academic Press, New York, 1967. 


Lax, P. D., Morawetz, C. S., and Phillips, R. S. Exponential decay of solutions of the wave equation in the
exterior of a star-shaped obstacle. CPAM, 16 (1963): 477-486.


Melrose, R. Singularities and energy decay in acoustic scattering. Duke Math. J., 46 (1979): 43-59.


Morawetz, C. S. The decay of solutions of the exterior initial-boundary value problem for the wave
equation. CPAM, 14 (1961): 561-568.


Müller, C. On the behaviour of the solutions of the differential equation $\Delta u = F(x, u)$ in the neighborhood
of a point. CPAM, 7 (1954): 505-515.


Ralston, J. Solution of the wave equation with localized energy. CPAM, 22 (1969): 807-823.


Taylor, M. Propagation, reflection and diffraction of singularities of solutions to wave equations. Bull.
AMS, 84 (1978): 589-611.


37 


SCATTERING THEORY 


Two souls dwell in the bosom of scattering theory. One is mathematical, and handles 
the unitary equivalence of operators with continuous spectra. The other is in physics, 
and deals with such notions as quasi-stationary states, cross sections, and what is 
observable in quantum mechanics. 


37.1 PERTURBATION THEORY 


Perturbation theory was developed by Lord Rayleigh, and by Erwin Schrédinger 
as a means of solving problems in physics, classical and quantum mechanical. A 
rigorous theory was set in place by Franz Rellich. In this section we describe the 
simplest results; an extensive discussion can be found in Kato’s magisterial book. 


Theorem 1. Let A be a self-adjoint operator in a Hilbert space H, and $\alpha$ an isolated
eigenvalue of A of multiplicity one. Let D be a bounded symmetric operator,
$\|D\| \le 1$. Then for $\epsilon$ small enough $A + \epsilon D$ has an isolated eigenvalue of multiplicity
one in $[\alpha - \epsilon, \alpha + \epsilon]$.


Proof. We need


Lemma 2. Let P and Q be orthogonal projections in a Hilbert space whose difference
has norm less than 1:

$$\|P - Q\| < 1. \tag{1}$$

Then the ranges of P and Q have the same dimension.

Proof. If the dimension of the range of $P$ were greater than that of $Q$, then there
would be a nonzero vector $u$ in the range of $P$ that is orthogonal to the range of $Q$.
For such $u$, $Pu = u$, $Qu = 0$, so $\|(P - Q)u\| = \|u\|$, contrary to (1). □


Let $C$ be a circle centered at the isolated eigenvalue $\alpha$ of $A$, with radius so small
that $C$ lies in the resolvent set of $A$. Let the only point of the spectrum of $A$ contained
inside the circle $C$ be $\alpha$. Define the mapping $P$ by

$$P = \frac{1}{2\pi i} \oint_C (\zeta - A)^{-1}\, d\zeta. \tag{2}$$

Clearly, $P$ is the projection $E(\{\alpha\})$, where $E$ is the projection-valued measure
entering the spectral resolution of $A$. By assumption, the range of $P$ is one
dimensional.

For $\epsilon$ small enough $\zeta - (A + \epsilon D)$ is invertible for all $\zeta$ on $C$, and its inverse differs
little from $(\zeta - A)^{-1}$. So we may define

$$P_\epsilon = \frac{1}{2\pi i} \oint_C (\zeta - A - \epsilon D)^{-1}\, d\zeta. \tag{2'}$$

$P_\epsilon$ is also an orthogonal projection, and $\|P - P_\epsilon\|$ tends to zero as $\epsilon$ tends to zero. It
follows therefore from lemma 2 that for $\epsilon$ small the range of $P_\epsilon$ is one dimensional;
every vector in the range is an eigenvector of $A + \epsilon D$. The corresponding eigenvalue
tends to $\alpha$ as $\epsilon$ tends to zero. □


Exercise 1. Prove that this eigenvalue of $A + \epsilon D$ differs from $\alpha$ by $< \epsilon$.


Corollary. It follows from formula (2') that the eigenvector and eigenvalue of
$A + \epsilon D$ depend analytically on $\epsilon$.


Exercise 2. Denote the eigenvalue of $A + \epsilon D$ by $\alpha(\epsilon)$, the corresponding eigenvector
of unit norm by $u(\epsilon)$. Show that

    $\frac{d\alpha}{d\epsilon} = (Du, u).$


How about higher derivatives? 
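
The statements of exercises 1 and 2 can be checked numerically on a toy example. This is a sanity check under stated assumptions, not a solution of the exercises: the 2x2 matrices A and D below are hypothetical, chosen so that $\alpha = 1$ is an isolated simple eigenvalue of A with eigenvector u = (1, 0) and $\|D\| < 1$.

```python
import math

def lower_eigenvalue(a, b, d):
    """Smaller eigenvalue of the symmetric 2x2 matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius

# Hypothetical data: A = diag(1, 3), stored as (a, b, d).
A = (1.0, 0.0, 3.0)
# Symmetric D with norm about 0.61; (Du, u) is its top-left entry for u = (1, 0).
D = (0.2, 0.5, -0.3)

def alpha(eps):
    """Eigenvalue of A + eps*D that continues the eigenvalue 1 of A."""
    return lower_eigenvalue(A[0] + eps * D[0], A[1] + eps * D[1], A[2] + eps * D[2])

h = 1e-5
derivative = (alpha(h) - alpha(-h)) / (2 * h)  # central difference at eps = 0
expected = D[0]                                # (Du, u)

eps = 0.1
shift = abs(alpha(eps) - alpha(0.0))           # to be compared with eps (exercise 1)
```

The central difference agrees with (Du, u) up to discretization error, and the eigenvalue shift stays below $\epsilon$ for this choice of $\epsilon$.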


REMARK. Suppose that $\alpha$ is an isolated eigenvalue of A of multiplicity n. The argument
used to prove theorem 1 shows that in this case $A + \epsilon D$ has n eigenvalues in
$[\alpha - \epsilon, \alpha + \epsilon]$, some of them possibly multiple.

From formula (2') we can conclude that the eigenvalues of $A + \epsilon D$ are algebraic
functions of $\epsilon$, with a possible algebraic singularity at $\epsilon = 0$. Rellich has pointed out
that an algebraic function can be expanded in a Puiseux series, a power series in $\epsilon^{1/p}$,
p the order of the branch point at $\epsilon = 0$; all branches of this power series represent
solutions of the algebraic equation; that is, they are eigenvalues of $A + \epsilon D$. But for
$p \ne 1$ some of these branches are complex valued for $\epsilon$ real; since all the eigenvalues
of the self-adjoint operator $A + \epsilon D$ are real, p must be 1. So all eigenvalues of $A + \epsilon D$
are analytic functions of $\epsilon$.


We turn now to the other extreme: 


Theorem 3. Let A be a self-adjoint operator in a Hilbert space H, $\alpha$ an essential
point in its spectrum, that is, for every interval I containing $\alpha$, E(I) has infinite-dimensional
range. Let C be any symmetric compact operator; then $\alpha$ belongs to the
essential spectrum of A + C.


Proof. If not, for some interval J containing $\alpha$, the range of $E_C(J)$ would be finite
dimensional, where $E_C$ is the spectral resolution of A + C. Adding another compact
operator to A + C would eliminate the finite number of eigenvalues in J altogether.
So it suffices to prove that $\alpha$ belongs to the spectrum of A + C.

To see this, we show that the norm of the resolvent of A + C at $\alpha + i\eta$ tends to $\infty$
as $\eta$ approaches zero. Decompose the compact operator C as

    $C = F + S$, where $Fx = \sum_{j=1}^{N} \gamma_j (x, f_j) f_j$, $\|S\| < \eta.$    (3)

For every x in the range of $E(\alpha - \eta, \alpha + \eta)$, $\|(A - \alpha - i\eta)x\| < 2\eta\|x\|$. Since
the range of $E(\alpha - \eta, \alpha + \eta)$ is infinite dimensional, we can choose x so that it is
orthogonal to all the $f_j$, j = 1, ..., N. For such an x, Fx = 0. Using this, as well
as (3), we get

    $\|(A + C - \alpha - i\eta)x\| = \|(A - \alpha - i\eta)x + Sx\| \le \|(A - \alpha - i\eta)x\| + \|Sx\| < 3\eta\|x\|.$    (4)

This proves that the norm of $(A + C - \alpha - i\eta)^{-1}$ is greater than $1/(3\eta)$. □


Theorem 3 is slightly misleading. Suppose that the spectrum of A is absolutely
continuous; every point of it belongs to the essential spectrum of A, and therefore to
the essential spectrum of A + C, C any symmetric compact operator. But, as Weyl
observed, the spectrum of A + C need not be continuous; he showed that there exists,
for any positive $\epsilon$, a compact operator C, $\|C\| < \epsilon$, such that the spectrum of A + C
is pure point spectrum, dense in the formerly continuous spectrum. This result,
spectral curdling, was sharpened by von Neumann, who showed that C may be
taken not only compact, but of Hilbert-Schmidt class (see chapter 30, section 30.8),
with arbitrarily small Hilbert-Schmidt norm.


For a symmetric compact operator C, with eigenvalues $\{\gamma_j\}$, the Hilbert-Schmidt
norm is $(\sum \gamma_j^2)^{1/2}$. More generally one can define the p-cross norm as
$(\sum |\gamma_j|^p)^{1/p}$; see Schatten. Kuroda has further sharpened von Neumann's result
by showing that C may be taken to have arbitrarily small p-cross norm, p > 1.
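
The gap between p > 1 and p = 1 can be seen on a simple sequence of eigenvalues. The sketch below is only an illustration: the eigenvalues $\gamma_j = 1/j$ and the helper names are hypothetical. It compares truncated trace (p = 1) and Hilbert-Schmidt (p = 2) norms; the first keeps growing with the truncation, the second settles near $\pi/\sqrt{6}$.

```python
def p_cross_norm(eigenvalues, p):
    """(sum |gamma_j|^p)^(1/p) for a finite list of eigenvalues."""
    return sum(abs(g) ** p for g in eigenvalues) ** (1.0 / p)

def truncated_norms(n):
    """Trace norm and Hilbert-Schmidt norm of the first n eigenvalues 1/j."""
    gammas = [1.0 / j for j in range(1, n + 1)]  # hypothetical eigenvalues
    return p_cross_norm(gammas, 1), p_cross_norm(gammas, 2)

trace_small, hs_small = truncated_norms(1_000)
trace_large, hs_large = truncated_norms(100_000)
```

An operator with these eigenvalues is of Hilbert-Schmidt class but not of trace class, since the harmonic series diverges.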

For p = 1, the p-cross norm is the trace norm; see chapter 30, section 30.2. For C
of trace class the story changes dramatically: Marvin Rosenblum has shown, in his
dissertation, that perturbing a self-adjoint operator by adding a symmetric operator
of trace class leaves the continuous spectrum continuous. The next two sections are
devoted to this result; we follow Kato's formulation.


37.2 THE WAVE OPERATORS


Let A and B be a pair of self-adjoint operators in a Hilbert space H. As we observed
in chapter 35, A and B generate one-parameter groups, which we denote as $e^{itA}$ and
$e^{itB}$. Define the one-parameter family of unitary operators W(t) by

    $W(t) = e^{itB} e^{-itA}.$    (5)

It is the strong limits of W(t) as t tends to $\pm\infty$ that occupy center stage:

    $W_+ = \text{s-}\lim_{t \to \infty} W(t), \qquad W_- = \text{s-}\lim_{t \to -\infty} W(t),$    (6)


provided that they exist. $W_+$ and $W_-$ are called the wave operators; when we need
to underline their dependence on A and B, we will denote them as $W_\pm(B, A)$.


Exercise 3. Show that

    $W_\pm(C, A) = W_\pm(C, B) \, W_\pm(B, A),$

provided that all the wave operators on the right exist.


Since the wave operators are the strong limits of unitary operators, they are isometric:

    $\|W_\pm u\| = \|u\|.$    (7)

Taking the adjoint of (5) yields

    $W^*(t) = e^{itA} e^{-itB}.$

Since a strong limit (even weak limit) of the adjoint is the adjoint of the limit of a
sequence of operators, we conclude that

    $W_\pm(A, B) = W_\pm^*(B, A),$    (8)

provided that all these wave operators exist.
It follows from definition (5) that for every real s,

    $W(t + s) = e^{isB} W(t) e^{-isA}.$

Taking the limit as t tends to $\infty$, we get

    $W_+ = e^{isB} W_+ e^{-isA},$

and similarly for $W_-$. We can rewrite this as

    $W_\pm e^{isA} = e^{isB} W_\pm.$    (9)


Form the difference quotients

    $W_+ \frac{e^{isA} - I}{s} v = \frac{e^{isB} - I}{s} W_+ v,$

where v is a vector in the domain of A. Take the limit $s \to 0$; on the left we get
$i W_+ A v$. Therefore the limit on the right exists too, so $W_+ v$ lies in the domain of B
and

    $W_+ A = B W_+.$    (9')

A similar relation holds for $W_-$.

The wave operator $W_+$ maps the Hilbert space H onto a subspace K of H. Relation (7)
shows that this mapping is unitary. Relation (9) shows that $e^{isB}$ maps K into
K; we claim that it maps the orthogonal complement $K^\perp$ of K into $K^\perp$. To see this,
denote by z any vector orthogonal to K. The relation $(W_+^* z, h) = (z, W_+ h)$ shows
that $K^\perp$ is the nullspace of $W_+^*$. Using the adjoint of relation (9), with -s in place
of s, we have

    $e^{isA} W_+^* = W_+^* e^{isB},$

which gives $W_+^* e^{isB} z = e^{isA} W_+^* z = 0$. This shows that the subspace K reduces
the operator B, namely that the operator B restricted to $K \cap D$ and to $K^\perp \cap D$ are
self-adjoint mappings on K and on $K^\perp$, respectively. Relation (9') shows that A
acting on H and B acting on K are unitarily equivalent.

Suppose that not only the wave operator $W_+(B, A)$ but also $W_+(A, B)$ exist; then
it follows from (8) that $W_+^*(B, A) = W_+(A, B)$, so by (7) $W_+^*(B, A)$ is an isometry.
It follows that its nullspace is trivial, so K = H. We have proved


Theorem 4. If both wave operators $W_+(B, A)$ and $W_+(A, B)$ exist, then A and B
are unitarily equivalent.


Operators that are unitarily equivalent have the same spectrum. In particular, they
have the same point spectrum. Since in scattering theory B is a perturbation of A, it
is highly unlikely that a perturbation would leave the eigenvalue of A unchanged; see
exercise 2. It is clear that if A has a point eigenvalue, the wave operators $W_\pm(A, B)$
would exist only under the most exceptional circumstances. However, this is easily
remedied by restricting attention to the absolutely continuous subspace $H^{(c)}$ of H, as
explained in chapter 31, section 31.4.


Exercise 4. In chapter 31, section 31.4, we dealt only with bounded operators; show
how to extend these notions to unbounded operators.


Definition. The generalized wave operator, also denoted as $W_\pm(B, A)$, is defined as

    $\text{s-}\lim_{t \to \pm\infty} W(t) P_c = W_\pm,$    (6')

where W(t) is defined by (5), and $P_c$ is the projection of the Hilbert space H onto
the subspace $H^{(c)}$ on which A has an absolutely continuous spectrum.


Exercise 5. Show that if the generalized wave operator $W_\pm(B, A)$ exists, it maps
$H^{(c)}$ onto a subspace K of H that reduces the operator B, and that B has an absolutely
continuous spectrum on K.


Exercise 6. Suppose that both generalized wave operators $W_\pm(B, A)$ and $W_\pm(A, B)$
exist; then the absolutely continuous parts of A and B are unitarily equivalent.


37.3 EXISTENCE OF THE WAVE OPERATORS 


The following result is due to Cook; see also Jauch and Kuroda: 


Theorem 5. Suppose that there is a dense subset J of $H^{(c)}$ such that for all u in J

(i) $e^{-itA} u$ belongs to the intersection $D(A) \cap D(B)$ of the domains of A and B.
(ii) $(B - A) e^{-itA} u$ is a continuous function of t.
(iii) $\|(B - A) e^{-itA} u\|$ is integrable up to $+\infty$.

Then the wave operator $W_+(B, A)$ exists.
A similar result holds for $W_-$.


Proof. It follows from (i) that for u in J, $W(t)u = e^{itB} e^{-itA} u$ is differentiable,
and that

    $\frac{d}{dt} W(t) u = i e^{itB} (B - A) e^{-itA} u.$    (10)

By (ii), this derivative is continuous; therefore integration gives

    $W(b)u - W(a)u = i \int_a^b e^{itB} (B - A) e^{-itA} u \, dt.$    (10')

Since $e^{itB}$ is a unitary operator,

    $\|W(b)u - W(a)u\| \le \int_a^b \|(B - A) e^{-itA} u\| \, dt.$

It follows from (iii) that the right side tends to zero as a, b tend to $\infty$.

This shows that the limit of W(t)u as $t \to +\infty$ exists for all u in J. Since J is
dense in H, and since the operators W(t) are norm preserving, it follows that the
strong limit (6) exists. □
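
The differentiation behind formula (10) can be spelled out as follows. This is a formal sketch, assuming u in J, so that hypothesis (i) places $e^{-itA}u$ in $D(A) \cap D(B)$, and using that $e^{itB}$ commutes with B on the domain of B.

```latex
\begin{aligned}
\frac{d}{dt} W(t)u
  &= \Bigl(\frac{d}{dt} e^{itB}\Bigr) e^{-itA}u
     + e^{itB}\,\frac{d}{dt}\bigl(e^{-itA}u\bigr) \\
  &= i\,e^{itB} B\, e^{-itA}u \;-\; i\,e^{itB} A\, e^{-itA}u \\
  &= i\,e^{itB}\,(B - A)\,e^{-itA}u .
\end{aligned}
```

The two terms exist separately only because $e^{-itA}u$ lies in both domains; this is where hypothesis (i) enters.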


Lemma 6. Suppose B differs from A by a bounded operator:

    $B = A + D, \qquad \|D\| < \infty,$

and suppose that the generalized wave operator $W_+(B, A)$ exists. Then for every
vector u in $H^{(c)}$

    $\|W_+ u - W(a) u\|^2 = -2 \, \mathrm{Im} \int_a^\infty (e^{itA} W_+^* D e^{-itA} u, u) \, dt.$    (11)


Proof. Since A and B differ by a bounded operator, formula (10) holds for u in
the domain of A, and so does formula (10'). Since the domain of A is dense in H,
(10') holds for all u in H. Let b in formula (10') tend to $+\infty$. By hypothesis, for
every u in $H^{(c)}$, W(b)u tends strongly to $W_+ u$, so we get

    $W_+ u - W(a) u = i \int_a^\infty e^{itB} D e^{-itA} u \, dt.$    (10'')

Since both W(a) and $W_+$ are isometric,

    $\|W_+ u - W(a) u\|^2 = 2\|u\|^2 - 2 \, \mathrm{Re} \, (W(a)u, W_+ u) = 2 \, \mathrm{Re} \, (W_+ u - W(a)u, W_+ u).$

Expressing $W_+ u - W(a) u$ on the right from (10''), and using the transpose of
identity (9), with s replaced by -t, gives (11). □
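
The last step of this proof can be written out in full. The sketch below assumes the scalar product is linear in its first argument, and uses the adjoint of (9) in the form $W_+^* e^{itB} = e^{itA} W_+^*$ together with $\mathrm{Re}(iz) = -\mathrm{Im}\, z$.

```latex
\begin{aligned}
(W_+u - W(a)u,\; W_+u)
  &= i\int_a^\infty \bigl(e^{itB} D e^{-itA}u,\; W_+u\bigr)\,dt \\
  &= i\int_a^\infty \bigl(W_+^* e^{itB} D e^{-itA}u,\; u\bigr)\,dt \\
  &= i\int_a^\infty \bigl(e^{itA} W_+^* D e^{-itA}u,\; u\bigr)\,dt , \\[4pt]
\|W_+u - W(a)u\|^2
  &= 2\,\mathrm{Re}\,(W_+u - W(a)u,\; W_+u) \\
  &= -2\,\mathrm{Im}\int_a^\infty \bigl(e^{itA} W_+^* D e^{-itA}u,\; u\bigr)\,dt .
\end{aligned}
```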


Theorem 7. (Rosenblum) Suppose that the self-adjoint operators A and B differ by
an operator D of trace class. Then the generalized wave operators $W_\pm(B, A)$ and
$W_\pm(A, B)$ exist.


As observed in exercise 6, it follows that the absolutely continuous parts of A and 
B are unitarily equivalent. 


Proof. We treat first the case when the perturbation D is of rank one:

    $Du = c (u, f) f, \qquad \|f\| = 1.$

We denote by K the smallest closed subspace of H that contains f and reduces
the operator A. The subspace K can be obtained as the closure of all vectors of
the form b(A)f, where b are bounded, continuous functions on $\mathbb{R}$; see chapter 31,
section 31.5. Note that B = A on the orthogonal complement of K, so for $u \perp K$,
W(t) is the identity, as are the wave operators. It suffices then to prove the existence
of the wave operators for A and B restricted to K.

Decompose f as

    $f = g + h, \qquad g = P_c f, \quad h \perp H^{(c)}.$

The closure of the set of vectors b(A)g is the subspace $K^{(c)}$ on which A is absolutely
continuous; A is singular on its orthogonal complement. Since g belongs to $H^{(c)}$ the
measure (Eg, g) is absolutely continuous. So $K^{(c)}$ can be represented by $L^2(S)$, S
some Borel subset of $\mathbb{R}$, and A acts as multiplication by $\lambda$ in this representation.


With Du defined as c(u, f) f, we take for simplicity c = 1, and define

    $D_c u = P_c D P_c u = (P_c u, f) P_c f = (u, g) g.$

$g = P_c f$ is represented by a square integrable function supported on S. Thus the
existence of the wave operators $W_\pm(B, A)$, B a rank one perturbation of A, has been
reduced to the case where the Hilbert space is $L^2(S)$, A is multiplication by $\lambda$, and
B = A + D, Du = (u, g) g.

We can regard $L^2(S)$ as a closed subspace of $L^2(\mathbb{R}) = H$. We extend A to H as
multiplication by $\lambda$, and Du as (u, g) g, where g is defined as zero on the complement
of S. It suffices to prove the existence of $W_\pm$ on the extended space.
We prove the existence of the wave operators first in the case where g is a smooth
function, it and its derivatives rapidly decrease as $\lambda$ tends to $\pm\infty$. We appeal to
theorem 5; we choose the dense subset J to consist of all smooth functions $u(\lambda)$
that decrease rapidly together with their derivatives as $\lambda \to \pm\infty$. We verify now the
three hypotheses of theorem 5:

(i) The domain of A and B consists of functions $u(\lambda)$ for which $\lambda u(\lambda)$ is square
integrable; hypothesis (i) is clearly satisfied.

(ii)

    $D e^{-itA} u = (e^{-it\lambda} u, g) \, g = \int e^{-it\mu} u(\mu) \overline{g(\mu)} \, d\mu \; g(\lambda) = \widetilde{u \bar g}(-t) \, g(\lambda),$    (12)

where ~ denotes the Fourier transform. Since both u and g decay rapidly at infinity,
$u \bar g$ is in $L^1$, and so its Fourier transform is continuous, as required in (ii). Since
the derivatives of u and g decay rapidly, $\widetilde{u \bar g}$ decays faster than any power of t as
$t \to \pm\infty$; therefore requirement (iii) is satisfied. So we conclude that the generalized
wave operators $W_\pm(B, A)$ exist.

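To pass to arbitrary g, we make use of the identity (11). Replacing $D e^{-itA} u$ in
this formula by its expression in formula (12), we get

    $\|W_+ u - W(a) u\|^2 = -2 \, \mathrm{Im} \int_a^\infty \widetilde{u \bar g}(-t) \, (e^{itA} W_+^* g, u) \, dt.$    (13)

Abbreviate $W_+^* g$ as $g^*$; then we can rewrite the second factor in the integral above as

    $(e^{itA} g^*, u) = \int e^{it\lambda} g^*(\lambda) \overline{u(\lambda)} \, d\lambda = \widetilde{g^* \bar u}(t).$

We estimate the integral on the right in (13) by the Schwarz inequality:

    $\|W_+ u - W(a) u\|^2 \le 2 \left( \int_a^\infty |\widetilde{u \bar g}|^2 \, dt \int_a^\infty |\widetilde{g^* \bar u}|^2 \, dt \right)^{1/2}.$

By Parseval's identity the Fourier transform is an isometric mapping times $2\pi$. So
the second factor on the right side above is less than

    $2\pi \int |g^* \bar u|^2 \, d\lambda \le 2\pi |u|_\infty^2 \int |g^*|^2 \, d\lambda,$

where $|u|_\infty$ is the maximum of $|u(\lambda)|$ on $\mathbb{R}$. Furthermore

    $\int |g^*|^2 \, d\lambda = \|g^*\|^2 = \|W_+^* g\|^2 \le \|g\|^2,$

since $W_+^*$, being the adjoint of the isometry $W_+$, has norm equal to 1. Putting these
inequalities together, we get

    $\|W_+ u - W(a) u\| \le (8\pi)^{1/4} |u|_\infty^{1/2} \|g\|^{1/2} \left( \int_a^\infty |\widetilde{u \bar g}|^2 \, dt \right)^{1/4}.$

A similar inequality holds for a replaced by b; so by the triangle inequality

    $\|W(b) u - W(a) u\| \le (8\pi)^{1/4} |u|_\infty^{1/2} \|g\|^{1/2} \left[ \left( \int_a^\infty |\widetilde{u \bar g}|^2 \, dt \right)^{1/4} + \left( \int_b^\infty |\widetilde{u \bar g}|^2 \, dt \right)^{1/4} \right].$    (13')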


This inequality was proved under the assumption that u and g are smooth and rapidly
decreasing. We show now that it holds for all g that are merely square integrable.
To this end we approximate in the $L^2$-norm any given g in $L^2(\mathbb{R})$ by a sequence of
$g_n$ that are smooth and rapidly decreasing at $\infty$. Inequality (13') holds for $g_n$, with
$W_n(t) = e^{itB_n} e^{-itA}$, where $B_n = A + (\cdot, g_n) g_n$. As $g_n \to g$, the right side of
this inequality tends to the right side of (13'); we claim that so does the left side.
To show that, we have to demonstrate that $e^{itB_n}$ tends to $e^{itB}$; this is easily done.
$e^{itB} u = u(t)$ and $e^{itB_n} u = u_n(t)$ are the solutions of the differential equations

    $\frac{d}{dt} u - iBu = 0, \qquad \frac{d}{dt} u_n - iB_n u_n = 0, \qquad u(0) = u_n(0) = u,$

where u belongs to the domain of B. Subtracting the equations, we get

    $\frac{d}{dt}(u - u_n) - iB(u - u_n) = i(B - B_n) u_n.$

Acting on both sides by $e^{-itB}$ and integrating, we get

    $u(s) - u_n(s) = i \int_0^s e^{i(s-t)B} (B - B_n) u_n \, dt.$

Since $e^{i(s-t)B}$ has norm 1 and $\|u_n(t)\| \le \|u\|$, we deduce that

    $\|u(s) - u_n(s)\| \le s \, \|B - B_n\| \, \|u\|.$

As $g_n$ tends to g, $\|B - B_n\| = \|(\cdot, g) g - (\cdot, g_n) g_n\|$ tends to zero; this completes
the proof that $e^{itB_n}$ tends to $e^{itB}$ in the uniform topology; strong convergence would
have been enough. This completes the proof that the inequality (13') is valid for all
g in $L^2$, and all smooth u that decrease rapidly at $\infty$.

The right side of (13') tends to zero as a and b tend to $\infty$; therefore so does the
left side. This proves the convergence of W(t)u as t tends to $\infty$. Since the smooth,
rapidly decreasing functions u are dense in $L^2(\mathbb{R})$, and since the operators W(t)
have norm 1, it follows that the limit exists for all u in $L^2(\mathbb{R})$, and therefore the
generalized wave operator $W_+(B, A)$ exists. Since the role of A and B is symmetric,
the wave operator $W_+(A, B)$ also exists; this shows that the absolutely continuous
parts of A and B are unitarily equivalent.

We pause for a moment to point out that even when the operator A has absolutely
continuous spectrum, a rank one perturbation can create a point spectrum of B. Take
A to be multiplication by x acting on $L^2(\mathbb{R})$, and $D = (\cdot, f) f$, f in $L^2$ and smooth
(Lipschitz continuity will do); we are looking for an eigenfunction u of A + D, with
eigenvalue t:

    $x u(x) + (u, f) f(x) = t \, u(x).$

Solve for u(x):

    $u(x) = \frac{(u, f)}{t - x} f(x).$

In order for u to be square integrable, t has to be a zero of f. Taking the scalar
product of both sides with f gives, after canceling the factor (u, f), the relation

    $1 = \int \frac{|f(x)|^2}{t - x} \, dx,$

a second condition, in addition to f(t) = 0. If both conditions are satisfied, A + D
has t as an eigenvalue.


Exercise 7. Show that f can be so chosen that A + D has n eigenvalues, n any natural
number. Can B have infinitely many eigenvalues?


The eigenvalue t can disappear at the slightest change in the function f; for al- 
though the changed f will have a zero near t, the second integral condition will 
in general not be satisfied. This phenomenon is called the instability of the point 
spectrum embedded in the continuous spectrum. 

We return now to complete the proof of theorem 7. 

Let D be a symmetric operator of trace class. Denote its eigenvalues by $d_k$, the
corresponding normalized eigenvector by $f_k$. By (20) of chapter 30,

    $\|D\|_{tr} \ge \sum |d_k|,$

and D itself can be expressed as

    $D = \sum d_k (\cdot, f_k) f_k.$

Denote by $D_n$ the finite sum

    $D_n = \sum_{k=1}^{n} d_k (\cdot, f_k) f_k.$

Define $B_n$ as $A + D_n$. By what we have already proved, the generalized wave operators
$W_\pm(B_n, B_{n-1})$ exist. Therefore (see exercise 3) the wave operator $W_\pm(B_n, A)$
exists; we denote it as $W_{n\pm}$. We use now formula (11):

    $\|W_{n+} u - W_n(a) u\|^2 = -2 \, \mathrm{Im} \int_a^\infty (e^{itA} W_{n+}^* D_n e^{-itA} u, u) \, dt.$
a 


Using the definition of $D_n$, we can write this as

    $\|W_{n+} u - W_n(a) u\|^2 = -2 \, \mathrm{Im} \int_a^\infty \sum_{k=1}^{n} d_k (e^{-itA} u, f_k) (e^{itA} W_{n+}^* f_k, u) \, dt.$

Using the Schwarz inequality, first with respect to the sum, then the integral, we get

    $\|W_{n+} u - W_n(a) u\|^2 \le 2 \left[ \sum_{k=1}^{n} |d_k| \int_a^\infty |(e^{-itA} u, f_k)|^2 \, dt \right]^{1/2} \left[ \sum_{k=1}^{n} |d_k| \int_a^\infty |(e^{itA} f_k^*, u)|^2 \, dt \right]^{1/2},$    (14)

where $f_k^*$ is an abbreviation for $W_{n+}^* f_k$. As we saw earlier, $\|f_k^*\| \le \|f_k\| = 1$.
The vector u for which we want to prove the existence of the strong limit W(t)u
as $t \to \infty$ belongs to the absolutely continuous subspace for the operator A, so
$u = P_c u$. Replace u by $P_c u$ on the right in (14); using the fact that $P_c$ commutes
with $e^{itA}$, we can rewrite the right side of (14) with $f_k$ and $f_k^*$ replaced by $P_c f_k$
and $P_c f_k^*$. Rather than rewriting (14), we just assume that $f_k$ and $f_k^*$ belong to the
absolutely continuous subspace of A.


Next we use the spectral resolution of A to rewrite the terms on the right in (14)
as follows:

    $(e^{-itA} u, f) = \int e^{-it\lambda} \, d(Eu, f)$

and

    $(e^{itA} f^*, u) = \int e^{it\lambda} \, d(Ef^*, u).$

Since u, f, and $f^*$ belong to the absolutely continuous subspace of A, we can rewrite
the integrals on the right in terms of the Radon-Nikodym derivative as

    $\int e^{-it\lambda} \frac{d}{d\lambda}(Eu, f) \, d\lambda$, etc.    (15)

By the Schwarz inequality,

    $\left| \frac{d}{d\lambda}(Eu, f) \right| \le \left[ \frac{d}{d\lambda}(Eu, u) \right]^{1/2} \left[ \frac{d}{d\lambda}(Ef, f) \right]^{1/2}.$

Exercise 8. Prove this inequality.

Suppose that the vector u has the property that

    $\sup_\lambda \frac{d}{d\lambda}(Eu, u) < \infty.$    (16)

Then, by the Schwarz inequality,

    $\left| \frac{d}{d\lambda}(Eu, f) \right| \le m \left[ \frac{d}{d\lambda}(Ef, f) \right]^{1/2},$

where m is the square root of the sup (16). This shows that $d(Eu, f)/d\lambda$ is square
integrable, and so by Parseval's theorem, (15) is square integrable as function of t.

Denote the function (15), with $f = f_k$, as $\eta_k(t)$, and the corresponding integral with f replaced
by $f^*$ as $\eta_k^*(t)$. Setting these in (14), we get

    $\|W_{n+} u - W_n(a) u\|^2 \le 2 \left[ \sum_{k=1}^{n} |d_k| \int_a^\infty |\eta_k|^2 \, dt \right]^{1/2} \left[ \sum_{k=1}^{n} |d_k| \int_a^\infty |\eta_k^*|^2 \, dt \right]^{1/2}.$    (14')

By Parseval's formula,

    $\int_a^\infty |\eta_k^*|^2 \, dt \le \int_{-\infty}^\infty |\eta_k^*|^2 \, dt = 2\pi \int \left| \frac{d}{d\lambda}(Ef_k^*, u) \right|^2 d\lambda \le 2\pi m^2 \int \frac{d}{d\lambda}(Ef_k^*, f_k^*) \, d\lambda = 2\pi m^2 \|f_k^*\|^2 \le 2\pi m^2.$

So we can rewrite (14') as

    $\|W_{n+} u - W_n(a) u\|^2 \le (8\pi)^{1/2} \, m \, \|D\|_{tr}^{1/2} \left[ \sum_{k=1}^{n} |d_k| \int_a^\infty |\eta_k|^2 \, dt \right]^{1/2};$    (14'')

here we have used the fact that $\sum |d_k| \le \|D\|_{tr}$. Replacing a by b and using the
triangle inequality, we get from (14'') that

    $\|W_n(b) u - W_n(a) u\| \le (8\pi \|D\|_{tr})^{1/4} \, m^{1/2} \left[ \left( \sum_{k=1}^{n} |d_k| \int_a^\infty |\eta_k|^2 \, dt \right)^{1/4} + \left( \sum_{k=1}^{n} |d_k| \int_b^\infty |\eta_k|^2 \, dt \right)^{1/4} \right].$


Now we pass to the limit as n tends to $\infty$. Since $B_n = A + D_n$ tends uniformly to
B = A + D, $W_n(a)$, $W_n(b)$ tend uniformly to W(a), W(b). The right side tends to
the infinite sum. In the resulting inequality we let a and b tend to $\infty$. We claim that
the right side tends to zero. This follows from these facts:

(i) $\int_{-\infty}^{\infty} |\eta_k(t)|^2 \, dt \le 2\pi m^2.$
(ii) $\sum |d_k| < \infty.$
(iii) $\lim_{a \to \infty} \int_a^\infty |\eta_k|^2 \, dt = 0.$

This proves the existence of the strong limit W(b)u as b tends to $\infty$.
This conclusion is valid for those vectors u that satisfy (16). The last step of the
proof is to verify that the set of such u is dense in $H^{(c)}$. This is not hard; let v be
any vector in $H^{(c)}$. For such a v, the measure (Ev, v) is absolutely continuous with
respect to Lebesgue measure, so we can write

    $\|v\|^2 = \int d(Ev, v) = \int \frac{d}{d\lambda}(Ev, v) \, d\lambda.$

$d(Ev, v)/d\lambda$ is nonnegative and in $L^1$. Denote by $S_m$ the set of $\lambda$ where
$d(Ev, v)/d\lambda > m$. Set $v_m = (I - E(S_m)) v$; denote by $\mu_m$ the measure
$(Ev_m, v_m)$.

If we denote the measure (Ev, v) by $\mu$, the two measures are related by $d\mu_m =
(1 - c_S) \, d\mu$, where $c_S$ is the characteristic function of the set $S_m$. It follows that the
Radon-Nikodym derivatives are similarly related:

    $\frac{d}{d\lambda}(Ev_m, v_m) = \frac{d\mu_m}{d\lambda} = (1 - c_S) \frac{d\mu}{d\lambda} = (1 - c_S) \frac{d}{d\lambda}(Ev, v).$

Since $S_m$ was chosen as the set where $d\mu/d\lambda > m$, it follows that $d\mu_m/d\lambda \le m$
for all $\lambda$. The vectors $v_m$ tend to v as m tends to $\infty$. This completes the proof
that the vectors u that satisfy (16) are dense in $H^{(c)}$. Since the domain of the wave
operator is a closed subspace of $H^{(c)}$, this completes the proof that the generalized
wave operator $W_+(B, A)$ exists when A and B differ by an operator of trace class. □


Of course, the wave operator $W_-(B, A)$ also exists, and since A and B enter
the hypothesis symmetrically, $W_\pm(A, B)$ also exist. Therefore the absolutely continuous
parts of A and B are unitarily equivalent.


37.4 THE INVARIANCE OF WAVE OPERATORS 


The scope of the main result of section 37.3 is greatly extended by the following
result: Let $\phi(\lambda)$ be a real valued function with the following properties:

(i) $\phi$ is piecewise differentiable.

(ii) $\phi'$ is positive, continuous and of bounded variation locally.


Theorem 8 (Birman-Kato). Let A and B be a pair of self-adjoint operators on a
Hilbert space that differ by an operator of trace class. Let $\phi$ be a function as above.
Then the wave operators $W_\pm(\phi(B), \phi(A))$ exist and are independent of $\phi$.
Since $W_\pm(\phi(A), \phi(B))$ also exist, the absolutely continuous parts of $\phi(A)$ and
$\phi(B)$ are unitarily equivalent.


Even when $\phi$ is not monotonic, the existence of $W_\pm(\phi(B), \phi(A))$ can be shown.
Only in this case these wave operators are no longer equal to $W_\pm(B, A)$ but are
composites of them.

For a proof of these results, see chapter X of Kato's book.


37.5 POTENTIAL SCATTERING 


As an example of the abstract theory developed in section 37.3, take the Hilbert space
$H = L^2(\mathbb{R}^3)$, $A = -\Delta$, the negative Laplace operator, and $B = -\Delta + q$, q a real-valued
function. We take the simplest case when the potential q is a square integrable
bounded function.


Theorem 9. For q bounded and square integrable, the wave operators $W_\pm(B, A)$
exist, where $A = -\Delta$, $B = -\Delta + q$.
Proof. Fourier transformation is the spectral representation for $-\Delta$, and shows
that its spectrum is absolutely continuous, fills $\mathbb{R}_+$, and is of infinite multiplicity. We
appeal now to theorem 5 in section 37.3. We take for the dense subset J the linear
span of functions of the form $e^{-(x-a)^2/2}$, a in $\mathbb{R}^3$. For u of this form, we can solve
the equation $u_t = -iAu = i\Delta u$ explicitly by Fourier transform:

    $\hat u_t = -i|\xi|^2 \hat u, \qquad \hat u(\xi, 0) = e^{-|\xi|^2/2 + i a \cdot \xi},$

so $\hat u(\xi, t) = e^{-(2it+1)|\xi|^2/2 + i a \cdot \xi}$. Taking the Fourier inverse gives

    $u(x, t) = e^{-itA} u(x) = (1 + 2it)^{-3/2} e^{-(x-a)^2/(2+4it)}.$


Clearly, this function belongs to the common domain of A and B, as required in (i).
Conditions (ii) and (iii) are also fulfilled, since

    $\|(B - A) e^{-itA} u\| = \|q \, u(\cdot, t)\| \le \|q\| \, |1 + 2it|^{-3/2}$

is a continuous function of t, integrable over the whole t axis.
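
The explicit wave packet can be checked numerically. The sketch below is an illustration only: the function names, step size, and sample points are hypothetical. It treats u as a function of r = |x - a| and t, verifies by finite differences that $u_t = i \Delta u$ (using the radial form of the Laplacian in three dimensions), and tests the pointwise bound $|u(x, t)| \le |1 + 2it|^{-3/2}$ that gives the estimate above.

```python
import cmath

def u(r, t):
    """Radial profile (1 + 2it)^(-3/2) * exp(-r^2 / (2 + 4it)), with r = |x - a|."""
    s = 1 + 2j * t
    return s ** (-1.5) * cmath.exp(-r * r / (2 * s))

def residual(r, t, h=1e-3):
    """Finite-difference value of |u_t - i * Laplacian(u)| for a radial function in R^3."""
    u_t = (u(r, t + h) - u(r, t - h)) / (2 * h)
    u_r = (u(r + h, t) - u(r - h, t)) / (2 * h)
    u_rr = (u(r + h, t) - 2 * u(r, t) + u(r - h, t)) / (h * h)
    laplacian = u_rr + 2.0 * u_r / r  # radial Laplacian in three dimensions
    return abs(u_t - 1j * laplacian)

def decay_bound_holds(r, t):
    """Check |u| <= |1 + 2it|^(-3/2), with a small rounding tolerance."""
    return abs(u(r, t)) <= abs(1 + 2j * t) ** (-1.5) + 1e-15

worst_residual = max(residual(r, t) for r in (0.5, 1.0, 2.0) for t in (-1.0, 0.3, 2.0))
bound_ok = all(decay_bound_holds(r, t) for r in (0.0, 1.0, 3.0) for t in (-2.0, 0.0, 5.0))
```

Since $|1 + 2it|^{-3/2} = (1 + 4t^2)^{-3/4}$ decays like $|t|^{-3/2}$, the bound is integrable over the whole t axis, which is condition (iii).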

It remains to show that the linear span of the functions $e^{-(x-a)^2/2}$ is dense in
$L^2(\mathbb{R}^3)$. According to the spanning criterion (see theorem 7 of chapter 6) we have to
show that a function f in $L^2(\mathbb{R}^3)$ that is orthogonal to all functions $e^{-(x-a)^2/2}$ is
zero. To see this, we rewrite this condition as

    $0 = \int f(y + a) e^{-y^2/2} \, dy = \int e^{i a \cdot \xi} \hat f(\xi) e^{-|\xi|^2/2} \, d\xi;$

here in the first step we changed the variable of integration to x - a = y, and in the
second step we use Parseval's relation. The last equation says that the Fourier inverse
of $\hat f e^{-|\xi|^2/2}$ is identically zero. But then so is $\hat f e^{-|\xi|^2/2}$, and so is $\hat f(\xi)$ and f itself. □


We conclude from theorem 9 that $-\Delta$ is unitarily equivalent to $B = -\Delta + q$ acting
on an invariant subspace of B. In particular, the continuous spectrum of $-\Delta + q$
contains the whole positive axis, of infinite multiplicity.

The restriction that q be square integrable can be relaxed. For sharper results, see
chapter X of Kato's book, and the literature quoted there. However, if q is merely
bounded, then with probability 1, $-\Delta + q$ has a pure point spectrum. This result, of
importance in solid state physics, is called Anderson localization; for example, see
Fröhlich and Spencer.

On the other hand, if the potential q(x) tends to zero fast enough as $|x| \to \infty$,
one can show that $(\lambda - A)^{-1}$ and $(\lambda - B)^{-1}$ differ by a trace class operator when
$\lambda < 0$. Then it follows from theorem 8 that $W_\pm(A, B)$ and $W_\pm(B, A)$ exist, and so
$-\Delta$ and the absolutely continuous part of $-\Delta + q$ are unitarily equivalent.


37.6 THE SCATTERING OPERATOR 


Suppose that B is a perturbation of A, and that the generalized wave operators
$W_\pm(B, A)$ exist and map the absolutely continuous part of A onto the absolutely
continuous part of B. Then, by (9'),

    $W_+ A = B W_+$ and $W_- A = B W_-.$

From these relations, we deduce that

    $W_+^{-1} W_- A = W_+^{-1} B W_- = A W_+^{-1} W_-.$

In words, $W_+^{-1} W_-$ commutes with A. This operator $W_+^{-1} W_-$ is called the scattering
operator, and is denoted as S.
The physical significance of the scattering operator is this:


Think of A and B as an unperturbed and perturbed Schrödinger operators, where
for large distances the perturbation is negligible. The operators discussed in section 37.5
form such a pair, for the potential q(x) tends to zero as x tends to $\infty$. For
large positive times, most of the signal has propagated to large distances, so that the
signal $e^{-itB} u$ differs very little from a signal governed by the unperturbed equation,
call it $e^{-itA} u_+$. Similarly, for t large, negative $e^{-itB} u$ differs very little from a signal
$e^{-itA} u_-$. Letting t tend to $\pm\infty$, we deduce that

    $u_+ = \lim_{t \to \infty} e^{itA} e^{-itB} u = W_+^{-1} u$

and

    $u_- = \lim_{t \to -\infty} e^{itA} e^{-itB} u = W_-^{-1} u.$

The operator linking $u_-$ and $u_+$, $W_+^{-1} W_-$, is the scattering operator. Thus the
scattering operator links the state of the perturbed system in the remote past to its
state in the dim future.

This time-dependent picture of the scattering process was described in 1945
by Møller. A stationary picture was formulated by John Wheeler in 1937. It
was elaborated by Heisenberg in 1943; Heisenberg's motivation was that a physical
theory should only deal with observable quantities. The forces acting on electrons
surrounding a nucleus cannot be measured; a physical experiment measures only
the outcome, taking place at $t = \infty$ on the atomic time scale, and compares it to
the setup, the state of the system at $t = -\infty$. The task of scattering theory is to
reconstruct the atomic forces from the scattering operator. This is no place to say
anything about this fascinating problem, another gift bestowed on mathematics by
physics; we refer to the review article of Ludvig Faddeev, and to volume 2 of Reed
and Simon.

Since S commutes with A, the natural description of S is in a spectral representation
of A, where S acts as a multiplication operator. We will elaborate this in a slightly
different setting in the next section.


HISTORICAL NOTE. In 1930 Heisenberg lectured at the University of Chicago on
the new quantum mechanics. His assistant there was the young American physicist
Frank Hoyt, who helped prepare the English lecture notes. During the Second World
War Hoyt joined the Manhattan project for building nuclear weapons; one of his
assignments was to scrutinize every wartime publication of Heisenberg and see if it
could be a by-product of bomb research. Hoyt studied very thoroughly the two papers
Heisenberg published in 1943 on scattering theory, and as he told me later, concluded
that they had no bearing on nuclear weapons. This may have saved Heisenberg's
life, for the OSS, the wartime precursor of the CIA, had been training an agent to
assassinate him.


37.7 THE LAX-PHILLIPS SCATTERING THEORY 


The setting is the same as in section 4 of chapter 36: a unitary group U(t) acting on
a separable Hilbert space H, and a pair of incoming and outgoing subspaces $F_-$ and
$F_+$, orthogonal to each other, each satisfying properties (27a)-(27c), respectively
(28a)-(28c):

    U(t) maps $F_-$ into $F_-$ for t < 0,    (17a)
    The intersection of U(t) $F_-$ is {0}.    (17b)
    The union of U(t) $F_-$ is dense in H,    (17c)

and similarly for $F_+$, changing the sign of t in (i).

We appeal now to the translation representation theorem 7 of chapter 35, which
says that if the above three conditions are satisfied, then H can be represented isometrically
as $L^2(N, \mathbb{R})$, so that the action of U(t) is represented by translation to
the right by t. Furthermore the incoming subspace $F_-$ is represented by $L^2(N, \mathbb{R}_-)$.
Similarly, since $F_+$ is an outgoing subspace, there is an outgoing representation of
H as $L^2(N, \mathbb{R})$, where U(t) is represented by translation, and $F_+$ is represented by
$L^2(N, \mathbb{R}_+)$. Since the dimensions of the auxiliary spaces N appearing in the incoming
and outgoing representations are equal to the multiplicity of the spectrum of the
generator of the group U(t), the two auxiliary spaces can be taken to be the same.

Let u be any vector in H, $k_-$ and $k_+$ its incoming and outgoing representers. We
define S to be the operator relating the two:

    $S k_- = k_+.$    (18)

We call S the scattering operator associated with U(t), $F_-$ and $F_+$.

In chapter II of Lax and Phillips's Scattering Theory, we show how to construct
an unperturbed group $U_0$ of unitary operators so that the scattering operator, defined
in section 37.6 in terms of the wave operators linking $U_0$ and U, is the same as the
one defined in (18).


Theorem 10. Let U(t) be a unitary group of operators, $F_-$ and $F_+$ orthogonal incoming
and outgoing subspaces, S the scattering operator defined in (18):

(i) S is unitary.
(ii) S commutes with translation.
(iii) S maps $L^2(N, \mathbb{R}_-)$ into itself.

Proof.

(i) Since both $k_-$ and $k_+$ represent u isometrically, $\|k_-\| = \|u\| = \|k_+\|$, so S
is an isometry. Since it maps $L^2(N, \mathbb{R})$ onto $L^2(N, \mathbb{R})$, it is unitary.

(ii) Since $k_-(x - t)$ and $k_+(x - t)$ both represent U(t)u, S maps the translate of
$k_-$ onto the same translate of $k_+$.

(iii) Any $k_-$ in $L^2(N, \mathbb{R}_-)$ is the incoming representation of a vector u in $F_-$.
Such a vector u is orthogonal to $F_+$; therefore its outgoing representation
$k_+$ is orthogonal to the outgoing representation of $F_+$, which is $L^2(N, \mathbb{R}_+)$.
This shows that $k_+$ belongs to $L^2(N, \mathbb{R}_-)$. □


Property (iii) is called causality, and can be put in the following words: The value
of $k_+$ on $\mathbb{R}_+$ depends on the value of $k_-$ only on $\mathbb{R}_+$.
The adjoint $S^*$ of S has analogous properties:
(i) $S^*$ is unitary.
(ii) $S^*$ commutes with translation.
(iii) $S^*$ maps $L^2(N, \mathbb{R}_+)$ into itself.


Let u be a vector in H, k_- and k_+ its incoming and outgoing translation
representations. The Fourier transforms of k_- and k_+, denoted as f_- and f_+, are the
incoming and outgoing spectral representations of u. We denote their relation as

    S f_- = f_+.    (19)

Theorem 10'.

(i') S is unitary;
(ii') S commutes with multiplication by bounded, measurable functions;
(iii') S maps the Fourier transform of L^2(N, R_-) into itself.


Proof. (i') follows from part (i) of theorem 10, since S is unitary and so is the
Fourier transform.

(ii') The Fourier transform transmutes translation by an amount a into multiplication
by e^{iaλ}. Therefore, by part (ii) of theorem 10, S commutes with multiplication
by e^{iaλ}. Given any bounded, measurable function b(λ), we can approximate it by a
sequence b_n of the form b_n(λ) = Σ_j c_j e^{i a_j λ}, so that lim b_n(λ) = b(λ) a.e., and
so that the functions b_n are uniformly bounded on R. It follows that for any f in
L^2(N, R), b_n f tends to bf, and b_n Sf tends to bSf, in the L^2(N, R) norm. Since S
is a bounded operator, S b_n f tends to S b f. Since S b_n f = b_n S f, (ii') follows.

(iii') is a restatement of part (iii) of theorem 10 in the spectral representation. □


Let k be an L^2 function supported on R_-. Then its Fourier transform f can be
extended as an analytic function to the lower half C_- of the complex plane, ζ =
λ + iη, η < 0, by the formula

    f(\zeta) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{0} k(x)\, e^{i\zeta x}\, dx.    (20)

Theorem 11 (Paley-Wiener). The Fourier transform f of a function k in L^2(N, R_-)
is a vector-valued function in L^2(N, R) that has an analytic extension f(ζ) into
C_-, with the following properties:

(i) For fixed η < 0, f(λ + iη) is a vector-valued L^2 function of λ. As η tends to
-∞, ||f(· + iη)|| tends to 0.

(ii) As η tends to 0, f(· + iη) tends to f in the L^2-norm.

Conversely, any function f with properties (i) and (ii) is the Fourier transform of
an L^2(N, R_-) function.


A proof for scalar-valued functions, employing nothing more than the Cauchy
integral theorem, is presented in chapter 38. An extension to the vector-valued case
is, as so often, straightforward.
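
As a concrete check of formula (20) and of properties (i) and (ii), one can take the scalar example k(x) = e^x for x < 0 (an illustrative choice, not taken from the text). Then (20) gives f(ζ) = 1/(√(2π)(1 + iζ)), whose only pole ζ = i lies in C_+, and ||f(· + iη)||^2 = 1/(2(1 - η)). The sketch below compares the closed form with a numerical quadrature; all function names are hypothetical.

```python
import cmath
import math


def ft_closed_form(zeta: complex) -> complex:
    """Formula (20) for k(x) = e^x on x < 0: 1 / (sqrt(2 pi) (1 + i zeta))."""
    return 1.0 / (math.sqrt(2.0 * math.pi) * (1.0 + 1j * zeta))


def ft_numeric(zeta: complex, cutoff: float = 40.0, steps: int = 40000) -> complex:
    """Composite Simpson rule for (1/sqrt(2 pi)) * int_{-cutoff}^{0} e^x e^{i zeta x} dx."""
    h = cutoff / steps

    def g(x: float) -> complex:
        return cmath.exp((1.0 + 1j * zeta) * x)

    total = g(-cutoff) + g(0.0)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * g(-cutoff + j * h)
    return total * h / 3.0 / math.sqrt(2.0 * math.pi)


def norm_sq_on_line(eta: float) -> float:
    """||f(. + i eta)||^2 = (1/2 pi) * int d lambda / ((1 - eta)^2 + lambda^2) = 1 / (2 (1 - eta))."""
    return 1.0 / (2.0 * (1.0 - eta))
```

As η tends to 0 the value 1/(2(1 - η)) tends to 1/2 = ||k||^2, and as η tends to -∞ it tends to 0, in line with (i) and (ii).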


We will denote the Fourier transform of L^2(N, R_-) as H_-, and the Fourier transform
of L^2(N, R_+) as H_+. H_+ can be characterized as consisting of vector-valued
functions in L^2(N, R) that have analytic extension into the upper half-plane C_+,
with properties analogous to those enumerated in theorem 11.


Theorem 12. The operator S defined in (19) can be realized as multiplication by an
operator-valued function M(λ), mapping N into N.

(i) M(λ) is unitary for almost all λ.
(ii) M(λ) is the boundary value of an operator-valued function M(ζ) holomorphic
in C_-.
(iii) For each ζ in C_-, M(ζ) is a contraction, mapping N → N.

The function M(ζ) is called the scattering matrix.


Proof. We want first to tackle (ii) and (iii). Let u be any vector in F_-. According
to part (iii') of theorem 10', its incoming and outgoing spectral representations f_-
and f_+ both belong to H_-. So f_- and f_+ are vector-valued analytic functions in C_-.

We now show that for any ζ in C_- and any f_- in H_-, the value of f_+(ζ) is
determined by the value of f_-(ζ). To prove this, it suffices to show that if f_-(ζ) = 0,
then f_+(ζ) = 0. We factor such an f_- as

    f_-(\lambda) = \frac{\lambda - \zeta}{\lambda - \bar\zeta}\, g(\lambda).

It follows from the Paley-Wiener theorem, theorem 11, that g belongs to H_-. Since
by theorem 10' S commutes with multiplication by functions bounded on R,

    f_+ = S f_- = S\, \frac{\lambda - \zeta}{\lambda - \bar\zeta}\, g = \frac{\lambda - \zeta}{\lambda - \bar\zeta}\, S g.

Since g belongs to H_-, by theorem 10' so does Sg; setting λ = ζ in the relation
above shows that f_+(ζ) = 0.
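
The argument relies on two properties of the factor multiplying g: it is bounded on the real axis (in fact of modulus one there), and it vanishes at ζ. A minimal numerical sketch follows, assuming the factor has the form b(λ) = (λ - ζ)/(λ - ζ̄) with ζ in C_-; the function name and the sample points are hypothetical.

```python
def blaschke_factor(lam: complex, zeta: complex) -> complex:
    """b(lam) = (lam - zeta) / (lam - conj(zeta)), with zeta in the lower half-plane."""
    return (lam - zeta) / (lam - zeta.conjugate())


zeta = 0.4 - 1.3j  # sample point in C_-
# Modulus of b at a few real points; each value should be 1 up to rounding.
on_axis = [abs(blaschke_factor(complex(x), zeta)) for x in (-5.0, 0.0, 0.4, 7.5)]
# Modulus of b at another point of C_-; it should be smaller than 1.
inside = abs(blaschke_factor(-1.0 - 2.0j, zeta))
```

Since 1/(λ - ζ̄) has its pole at ζ̄ in C_+, dividing by it introduces no singularity in C_-, which is why g stays analytic there.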
f_+(ζ) is related to f_-(ζ) by a linear mapping of N → N; we denote it as M(ζ):

    M(\zeta)\, f_-(\zeta) = f_+(\zeta).    (21)




To show that M(ζ) is strongly analytic, take f_-(λ) = n/(λ - i), n any vector
in N. Clearly, f_- belongs to H_-; therefore so does f_+. Set this pair in (21):

    \frac{1}{\zeta - i}\, M(\zeta)\, n = f_+(\zeta);

since f_+(ζ) is analytic in C_-, so is M(ζ)n.
(iii) Take any ζ in C_-, any vector n in N, and define

    k_+(x) = \begin{cases} e^{i\zeta x}\, n & \text{for } x < 0, \\ 0 & \text{for } x > 0. \end{cases}    (22)

For any positive r,

    k_+(x - r) = e^{-i\zeta r}\, e^{i\zeta x}\, n \quad \text{for } x < r.    (22')

Set (22) into (22'):

    k_+(x - r) - e^{-i\zeta r}\, k_+(x) = 0 \quad \text{for } x < 0.    (23)

Define

    k_- = S^* k_+.    (24)

Since S* commutes with translation, and since S* maps L^2(N, R_+) into itself, we
deduce from (23) that for all positive r,

    k_-(x - r) - e^{-i\zeta r}\, k_-(x) = 0 \quad \text{for } x < 0.    (23')

This implies that

    k_-(x) = e^{i\zeta x}\, m \quad \text{for } x < 0,    (22'')

where m is some vector in N. We define as before the incoming and outgoing spectral
representations as

    f_- = F k_-, \quad f_+ = F k_+,

where F is the Fourier transformation. Using formulas (22) and (22'') we get

    f_+(\lambda) = \frac{-i}{\sqrt{2\pi}}\, \frac{n}{\lambda + \zeta}    (25)

and

    f_-(\lambda) = \frac{-i}{\sqrt{2\pi}}\, \frac{m}{\lambda + \zeta} + a_+(\lambda),    (25')

where a_+ is the Fourier transform of k_- restricted to R_+. Therefore a_+ belongs
to H_+.
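Let p be any vector in N. Using formula (25), we get

    \left( f_+, \frac{p}{\lambda + \zeta} \right) = \frac{-i}{\sqrt{2\pi}} \int \frac{(n, p)_N}{(\lambda + \zeta)(\lambda + \bar\zeta)}\, d\lambda.

The calculus of residues gives

    \left( f_+, \frac{p}{\lambda + \zeta} \right) = -i \sqrt{\frac{\pi}{2}}\, \frac{(n, p)_N}{|\mathrm{Im}\, \zeta|}.    (26)

Similarly, using (25'), we get

    \left( f_-, \frac{p}{\lambda + \zeta} \right) = \frac{-i}{\sqrt{2\pi}} \int \frac{(m, p)_N}{(\lambda + \zeta)(\lambda + \bar\zeta)}\, d\lambda + \int \frac{(a_+(\lambda), p)_N}{\lambda + \bar\zeta}\, d\lambda.

Since a_+ belongs to H_+, we can in the second integral on the right shift the path
of integration from the real axis to λ + iκ, κ > 0. Estimating the integral by the
Schwarz inequality, we see that it tends to zero as κ tends to ∞. So we obtain

    \left( f_-, \frac{p}{\lambda + \zeta} \right) = -i \sqrt{\frac{\pi}{2}}\, \frac{(m, p)_N}{|\mathrm{Im}\, \zeta|}.    (26')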


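The Fourier transform of (24) is f_- = S* f_+. We write, using this and (25), that

    \left( \frac{p}{\lambda + \zeta}, f_- \right) = \left( \frac{p}{\lambda + \zeta}, S^* f_+ \right)
      = \left( S \frac{p}{\lambda + \zeta}, f_+ \right) = \frac{i}{\sqrt{2\pi}} \int \left( \Bigl[ S \frac{p}{\lambda + \zeta} \Bigr](\lambda), n \right)_N \frac{d\lambda}{\lambda + \bar\zeta}.

Since p/(λ + ζ) belongs to H_-, so does S(p/(λ + ζ)); thus the integrand above is
meromorphic in the lower half-plane, with a simple pole at -ζ̄. We shift the path of
integration from the real axis to λ + iκ and let κ tend to -∞. We obtain, using (21),
that

    \left( \frac{p}{\lambda + \zeta}, f_- \right) = i \sqrt{\frac{\pi}{2}}\, \frac{(M(-\bar\zeta)\, p, n)_N}{|\mathrm{Im}\, \zeta|}.    (27)

Combine this with (26'):

    (m, p)_N = (n, M(-\bar\zeta)\, p)_N = (M^*(-\bar\zeta)\, n, p)_N.

Since this holds for all p in N,

    m = M^*(-\bar\zeta)\, n.    (28)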




We will now estimate the norm of m. Applying the Schwarz inequality to the left
side of (26'), we get

    \sqrt{\frac{\pi}{2}}\, \frac{|(m, p)_N|}{|\mathrm{Im}\, \zeta|} \le \|f_-\|\, \left\| \frac{p}{\lambda + \zeta} \right\|.    (29)

Using the definition of f_- as F k_-, (24), and that S* is a unitary operator, we get

    \|f_-\| = \|k_-\| = \|S^* k_+\| = \|k_+\|.    (30)

Using the definition (22) of k_+,

    \|k_+\|^2 = \int_{-\infty}^{0} |n|_N^2\, |e^{i\zeta x}|^2\, dx = \frac{1}{2\, |\mathrm{Im}\, \zeta|}\, |n|_N^2.    (30')

By calculus,

    \left\| \frac{p}{\lambda + \zeta} \right\|^2 = \int \frac{d\lambda}{|\lambda + \zeta|^2}\, |p|_N^2 = \frac{\pi}{|\mathrm{Im}\, \zeta|}\, |p|_N^2.    (30'')

Setting (30), (30'), and (30'') into the right side of (29) gives

    |(m, p)_N| \le |n|_N\, |p|_N.

Since this holds for all p in N, it follows that |m|_N ≤ |n|_N. In light of (28),
|M*(-ζ̄)|_N ≤ 1, which implies that |M(-ζ̄)|_N ≤ 1, as asserted in (iii).
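
The two elementary integrals behind this estimate, the norm of k_+ in (30') and the integral evaluated "by calculus" in (30''), can be verified numerically. The sketch below assumes |n|_N = |p|_N = 1 and a sample ζ in C_-; the helper names are hypothetical and the quadrature rules are ordinary Simpson and midpoint rules, not anything taken from the text.

```python
import cmath
import math


def simpson(f, a, b, steps=20000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3.0


def k_plus_norm_sq(zeta, cutoff=60.0):
    """int_{-cutoff}^{0} |e^{i zeta x}|^2 dx, approximating (30') with |n| = 1."""
    return simpson(lambda x: abs(cmath.exp(1j * zeta * x)) ** 2, -cutoff, 0.0)


def lorentz_integral(zeta, steps=20000):
    """int_R d lambda / |lambda + zeta|^2, using lambda = tan(theta) and the midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for j in range(steps):
        t = math.tan(-math.pi / 2 + (j + 0.5) * h)
        total += (1.0 + t * t) / abs(t + zeta) ** 2
    return total * h
```

For ζ = 0.3 - 0.7i the two values should be close to 1/(2 · 0.7) and π/0.7, whose product has square root √(π/2)/0.7, the constant that cancels against the left side of (29).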

(M(ζ) n, p) is a bounded analytic function in C_-. According to a basic result of
the theory of analytic functions,

    \lim_{\eta \to 0} (M(\lambda + i\eta)\, n, p)_N    (31)

exists for a.a. real λ. Take for n and p a denumerable dense set of vectors in N; since
|M(ζ)| ≤ 1, it follows that the limit (31) exists for all n and p in N, for almost all λ.
Denote the weak limit (31) by M(λ); clearly, |M(λ)| ≤ 1 a.e.

Let n be any vector in N. The function f_-(λ) = n/(λ - i) belongs to H_-; therefore
it is the incoming spectral representation of some vector u in F_-. The outgoing
spectral representation f_+ of u also belongs to H_-. Set ζ = λ + iη in (21):

    \frac{1}{\lambda + i\eta - i}\, M(\lambda + i\eta)\, n = f_+(\lambda + i\eta), \quad \eta < 0,    (21')

and let η tend to zero. The right side tends to f_+ in the L^2(N, R) norm; therefore, so
does the left side. It is not hard to deduce from this that M(λ + iη) n tends strongly
to M(λ) n, for a.a. real λ.




Since S is an isometry, we deduce from (21') that

    \int \frac{|n|_N^2}{\lambda^2 + 1}\, d\lambda = \|f_-\|^2 = \|f_+\|^2 = \int \frac{|M(\lambda)\, n|_N^2}{\lambda^2 + 1}\, d\lambda.

We noted earlier that |M(λ)| ≤ 1 for a.a. λ, so it follows from the above that
|M(λ) n|_N = |n|_N for a.a. λ.

The operator S is multiplication by M(λ); therefore the operator S* is multiplication
by M*(λ). Since S* is an isometry, it follows that M*(λ) is an isometry
for a.a. λ. We conclude that M(λ) is unitary for a.a. λ. This completes the proof of
theorem 12. □


Note that M*(λ) is the boundary value of the function M*(ζ), holomorphic for
ζ in C_+.


37.8 THE ZEROS OF THE SCATTERING MATRIX 


We recall from chapter 36, section 36.4, the Lax-Phillips semigroup

    Z(t) = P_+ U(t) P_-,

where P_-, P_+ are projections onto the orthogonal complements of F_- and F_+, a
pair of incoming and outgoing subspaces, orthogonal to each other. The semigroup
Z(t) acts on the space K = H ⊖ F_- ⊖ F_+.

Since the semigroup Z and the scattering matrix have the same ingredients, there
is bound to be some relation between them. Here it is; denote the infinitesimal
generator of Z(t) by G:

Theorem 13. A complex number γ, Re γ < 0, belongs to the point spectrum of G if
and only if M*(iγ̄) has a nontrivial nullspace.


Proof. Let u be an eigenvector of G:

    G u = \gamma u, \quad Z(t)\, u = e^{\gamma t}\, u.    (32)

Let k_+ be the outgoing translation representation of u. Since u belongs to K, it is
orthogonal to F_+; therefore k_+ is zero on R_+. In the outgoing representation Z(t)
acts as translation by t, followed by restriction to the negative axis; so (32) becomes

    k_+(x - t) = e^{\gamma t}\, k_+(x), \quad x < 0, \; t > 0.

It follows that

    k_+(x) = \begin{cases} e^{-\gamma x}\, n & \text{for } x < 0, \\ 0 & \text{for } 0 < x, \end{cases}

n some vector in N.

The outgoing spectral representation of u is the Fourier transform of k_+:

    f_+(\lambda) = F k_+ = \frac{1}{\sqrt{2\pi}}\, \frac{n}{i\lambda - \gamma}.

Since M^{-1}(λ) = M*(λ) for λ real, the incoming spectral representation is, by (21),

    f_-(\lambda) = \frac{1}{\sqrt{2\pi}}\, \frac{M^*(\lambda)\, n}{i\lambda - \gamma}.    (33)

Since u belongs to K, it is orthogonal to F_-; therefore f_- is orthogonal to H_-.
Consequently f_- belongs to H_+, and thus has an analytic extension to C_+. Formula (33)
gives a meromorphic extension of f_- to C_+; it is analytic iff the potential pole at
λ = -iγ is cancelled by a zero of M*(ζ) n at ζ = -iγ:

    M^*(i\bar\gamma)\, n = 0.

The reverse of this argument gives the converse proposition. □
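
Two computations in this proof can be spot-checked numerically: that k(x) = e^{-γx} on x < 0 satisfies the eigenfunction relation k(x - t) = e^{γt} k(x), and that its Fourier transform is 1/(√(2π)(iλ - γ)), with its only pole at λ = -iγ in C_+. The value of γ, the choice n = 1 (scalar case), and all names below are illustrative assumptions.

```python
import cmath
import math

GAMMA = -0.5 + 2.0j  # sample eigenvalue with Re(gamma) < 0 (illustrative)


def k_plus(x: float) -> complex:
    """Outgoing representer of the eigenvector: e^{-gamma x} for x < 0, zero otherwise."""
    return cmath.exp(-GAMMA * x) if x < 0 else 0.0


def f_plus_closed(lam: float) -> complex:
    """Closed form of the Fourier transform of k_plus."""
    return 1.0 / (math.sqrt(2.0 * math.pi) * (1j * lam - GAMMA))


def f_plus_numeric(lam: float, cutoff: float = 80.0, steps: int = 80000) -> complex:
    """Composite Simpson rule for (1/sqrt(2 pi)) * int_{-cutoff}^{0} e^{-gamma x} e^{i lam x} dx."""
    h = cutoff / steps

    def g(x: float) -> complex:
        return cmath.exp((-GAMMA + 1j * lam) * x)

    total = g(-cutoff) + g(0.0)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * g(-cutoff + j * h)
    return total * h / 3.0 / math.sqrt(2.0 * math.pi)


# Location of the potential pole of (33); it lies in the upper half-plane.
pole = -1j * GAMMA
```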


The proof presented above gives a little more: the dimension of the nullspace of
G - γI equals the dimension of the nullspace of M*(iγ̄). More generally, one can
show:

Theorem 13'. A complex number γ belongs to the resolvent set of G iff S(iγ̄) is
invertible.

Proof. For a proof we refer to section 3, chapter III of Lax and Phillips's Scattering
Theory.


According to theorem 12, the scattering matrix M(λ) can be extended analytically
into the lower half-plane C_-. Suppose that M(λ) is continuous in the norm
topology along an interval I of the real axis. Then M can be continued analytically
across I by the operator version of the Schwarz reflection principle:

    M(\zeta) = \bigl[ M(\bar\zeta)^{*} \bigr]^{-1}    (34)

for ζ in C_+ near I.


37.9 THE AUTOMORPHIC WAVE EQUATION 


Faddeev and Pavlov have given a beautiful application of the Lax-Phillips scattering
theory to automorphic solutions of the wave equation in the hyperbolic plane. The
Poincaré model of the hyperbolic plane ℍ is the upper half-plane (x, y), y > 0,
equipped with the Riemannian metric

    ds^2 = \frac{dx^2 + dy^2}{y^2}.    (35)

The isometries, called hyperbolic motions, can be expressed elegantly using the
complex variable z = x + iy as

    z \to \frac{az + b}{cz + d}, \quad a, b, c, d \text{ real}, \quad ad - bc = 1.    (36)


Exercise 9. Show that the metric (35) is invariant under the hyperbolic motions (36). 
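
Before attempting the exercise, the claim can be sanity-checked numerically. The sketch below (hypothetical function names, one sample motion) compares the hyperbolic length |dz|/y of a short segment before and after applying a motion of form (36), and checks the standard identity Im((az + b)/(cz + d)) = Im z/|cz + d|^2. It is a numerical illustration only, not a proof.

```python
def mobius(z: complex, a: float, b: float, c: float, d: float) -> complex:
    """Hyperbolic motion (36): z -> (a z + b) / (c z + d), with ad - bc = 1."""
    return (a * z + b) / (c * z + d)


def length_ratio(z: complex, a: float, b: float, c: float, d: float,
                 eps: float = 1e-6) -> float:
    """Ratio of the hyperbolic lengths |dz| / y of a short segment after and before the motion."""
    dz = eps * (0.6 + 0.8j)  # short segment in an arbitrary direction
    w0 = mobius(z, a, b, c, d)
    w1 = mobius(z + dz, a, b, c, d)
    return (abs(w1 - w0) / w0.imag) / (abs(dz) / z.imag)
```

With a, b, c, d = 2, 1, 3, 2 (so that ad - bc = 1) and z = 0.3 + 1.7i the ratio should be 1 up to an error of the order of the segment length.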


The group G of hyperbolic motions has many interesting discrete subgroups Γ,
which have the property that the images of any point under the mappings in Γ
accumulate only at ∞. A function u(x, y) is called automorphic with respect to the
subgroup Γ if u(γ(x, y)) = u(x, y) for every motion γ contained in Γ.

A domain P in ℍ is called a fundamental domain for Γ if

(i) every point in ℍ can be mapped into a point of P by some γ in Γ;
(ii) no two points of P are mapped into each other by any γ in Γ.

A boundary point of P will be mapped into another boundary point by some γ
in Γ.


Exercise 10. Show that the image of a fundamental domain by any γ in Γ is another
fundamental domain.


A fundamental domain is called a fundamental polygon if its boundary consists of
a finite number of geodesics. The geodesics in the Poincaré model are circles whose
center is located on the line y = 0, and their limits, the lines x = const.

The discrete subgroups that are amenable to scattering theory have fundamental
polygons that are unbounded. In this section we will look at the simplest such
subgroup, the modular group Γ consisting of all hyperbolic motions of form (36),
where a, b, c, d are integers. Clearly, these form a subgroup, and it is not hard to
show that this subgroup is discrete. A fundamental domain for the modular group
is the geodesic triangle T bounded by the geodesic arcs x = ±1/2, y > √3/2, and
x^2 + y^2 = 1, -1/2 < x < 1/2.


Exercise 11. Draw a picture of T. 


Exercise 12. Look up a proof of the fact that T is a fundamental domain for the 
modular group. 


Exercise 13. Show that the modular group is generated by the two transformations
z → z + 1 and z → -1/z.
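
The two generators named in Exercise 13 also give the classical reduction procedure behind the statement that T is a fundamental domain: translate by an integer until |Re z| ≤ 1/2, invert if |z| < 1, and repeat. A minimal floating-point sketch, with a hypothetical function name and step limit:

```python
def reduce_to_fundamental_domain(z: complex, max_steps: int = 1000) -> complex:
    """Move z (with Im z > 0) into the closure of T using z -> z + k and z -> -1/z."""
    if z.imag <= 0:
        raise ValueError("z must lie in the upper half-plane")
    for _ in range(max_steps):
        # Integer translation: brings the real part into [-1/2, 1/2].
        z = complex(z.real - round(z.real), z.imag)
        if abs(z) >= 1.0:
            return z
        # Inversion: raises the imaginary part whenever |z| < 1.
        z = -1.0 / z
    raise RuntimeError("no convergence within max_steps")
```

The loop terminates because each inversion applied to a point with |z| < 1 strictly increases Im z, while translations leave it unchanged.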


Exercise 14. Show that the fundamental triangle T has finite hyperbolic area. Cal- 
culate its area. 




The motion z → z + 1 carries the side x = -1/2 of T onto the other side x = 1/2.
The motion z → -1/z carries the third side x^2 + y^2 = 1 onto itself, mapping the
point (x, y) to (-x, y). Denote by p and p' pairs of boundary points linked to each
other by a mapping in the modular group. A C^1 automorphic function u satisfies the
boundary conditions

    u(p) = u(p'), \quad u_n(p) = -u_n(p')    (37)

at corresponding boundary points of T, where u_n denotes the outward normal
derivative.

The Laplace-Beltrami operator in the Poincaré model is

    \Delta_{\mathbb{H}} = -y^2 (\partial_x^2 + \partial_y^2).    (38)

Exercise 15. Show that Δ_ℍ is invariant under the hyperbolic motions (36).


Denote by ( , )_T the L^2 scalar product over T with respect to the hyperbolic area
element:

    (u, v)_T = \iint_T u\, v\, \frac{dx\, dy}{y^2}.    (39)

Let u be a C^2 automorphic function that is zero for y near ∞. Integration by parts
yields

    (\Delta_{\mathbb{H}} u, u)_T = \iint_T (u_x^2 + u_y^2)\, dx\, dy;    (40)

the boundary term is zero because of the boundary conditions (37).

Formula (40) shows that the operator Δ_ℍ, defined for all automorphic functions, is
symmetric and nonnegative. Its Friedrichs extension, also denoted as Δ_ℍ, is a
self-adjoint operator. What is its spectrum? It turns out that it is more natural to
renormalize Δ_ℍ as

    L = \Delta_{\mathbb{H}} - \tfrac{1}{4} I.    (41)

Theorem 14.


(i) On the interval [-1/4, 0] the spectrum of L consists of the single point -1/4.

(ii) L has infinitely many positive eigenvalues, accumulating at ∞. The odd
eigenfunctions span the space of odd functions in T.

(iii) L has absolutely continuous spectrum of multiplicity 1 on R_+.

Proof. We will not give a complete proof of part (i), for it would take us too far
afield.




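(i) Using formula (40) and the definition of L, we write

    (Lu, u)_T = \iint_T \left( u_x^2 + u_y^2 - \frac{u^2}{4y^2} \right) dx\, dy.    (42)

Let a be any number > 2. We divide T into two parts, T = T_a ∪ T^a, T_a denoting
the part of T below y = a, T^a the part above. Let u(y) be a C^1 function, equal to zero
for y near ∞; integration by parts yields

    \int_a^\infty \left( u_y - \frac{u}{2y} \right)^2 dy = \int_a^\infty \left( u_y^2 - \frac{u u_y}{y} + \frac{u^2}{4y^2} \right) dy
      = \int_a^\infty \left( u_y^2 - \frac{u^2}{4y^2} \right) dy + \frac{u^2(a)}{2a}.    (43)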


Define φ(y) = (2y - a)/a; since φ(a) = 1, φ(a/2) = 0, φ' = 2/a, we can express

    u^2(a) = \int_{a/2}^{a} \partial_y (\varphi u^2)\, dy = \int_{a/2}^{a} (\varphi' u^2 + 2 \varphi u u_y)\, dy
      \le \frac{2}{a} \int_{a/2}^{a} u^2\, dy + 2 \left( \int_{a/2}^{a} u^2\, dy \right)^{1/2} \left( \int_{a/2}^{a} u_y^2\, dy \right)^{1/2}
      \le \frac{2}{a} \int_{a/2}^{a} u^2\, dy + \frac{1}{a} \int_{a/2}^{a} u^2\, dy + a \int_{a/2}^{a} u_y^2\, dy.

In the third step we have used the Schwarz inequality, and in the last the arithmetic-
geometric inequality. Since y < a in the range of integration, we deduce that

    u^2(a) \le a \int_{a/2}^{a} \left( \frac{3u^2}{y^2} + u_y^2 \right) dy.    (44)


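Let u(x, y) be a C^2 automorphic function that is zero for y near ∞. We integrate
inequality (44) with respect to x over [-1/2, 1/2]; we obtain

    \int_{-1/2}^{1/2} u^2(x, a)\, dx \le a \iint_{T_a} \left( \frac{3u^2}{y^2} + u_y^2 \right) dx\, dy.    (44')

Integrating (43) with respect to x, and using that its left side is nonnegative, we obtain

    \iint_{T^a} \left( u_y^2 - \frac{u^2}{4y^2} \right) dx\, dy \ge -\frac{1}{2a} \int_{-1/2}^{1/2} u^2(x, a)\, dx.    (43')

Combine (43') and (44'):

    \iint_{T^a} \left( u_y^2 - \frac{u^2}{4y^2} \right) dx\, dy \ge -\frac{1}{2} \iint_{T_a} \left( \frac{3u^2}{y^2} + u_y^2 \right) dx\, dy.    (45)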


Denote the integrand on the right in (42) by q, split the integral into two parts, and
use inequality (45):

    (Lu, u)_T = \iint_{T_a} q\, dx\, dy + \iint_{T^a} q\, dx\, dy
      \ge \iint_{T_a} \left( u_x^2 + u_y^2 - \frac{u^2}{4y^2} \right) dx\, dy - \frac{1}{2} \iint_{T_a} \left( \frac{3u^2}{y^2} + u_y^2 \right) dx\, dy
      = \iint_{T_a} \left( u_x^2 + \frac{1}{2} u_y^2 - \frac{7u^2}{4y^2} \right) dx\, dy
      = \iint_{T_a} \left( u_x^2 + \frac{1}{2} u_y^2 + \frac{u^2}{4y^2} - \frac{2u^2}{y^2} \right) dx\, dy.

Define the quadratic functional K as

    K(u) = \iint_{T_a} \frac{2u^2}{y^2}\, dx\, dy,

and add it to both sides:

    (Lu, u)_T + K(u) \ge C(u),    (46)

where

    C(u) = \iint_{T_a} \left( u_x^2 + \frac{1}{2} u_y^2 + \frac{1}{4} \frac{u^2}{y^2} \right) dx\, dy.

It follows from Rellich's compactness theorem that for any positive ε there is a
subspace of u of finite codimension on which K(u) ≤ εC(u). Taking, modestly, ε = 1,
we conclude from (46) that (Lu, u) ≥ 0 on such a subspace. It follows from this
and (41) that the spectral resolution of L on [-1/4, 0] has finite-dimensional range;
therefore the spectrum of L over [-1/4, 0] consists of a finite number of eigenvalues.

Since T has finite area, u ≡ 1 is square integrable, and an eigenfunction of L with
eigenvalue -1/4. There are in fact no others in [-1/4, 0], but we skip the proof.

(ii) Both the operator L and the fundamental domain T are invariant under reflection
across the y axis: x → -x. It follows that the domain of L can be reduced as the
direct sum of even and odd automorphic functions. For odd functions the first
condition in (37) becomes u = 0 on the boundary; the second condition is automatically
satisfied.

We will show that under the Dirichlet boundary condition u = 0, the resolvent
(L + I)^{-1} is a compact operator. To see this, denote (L + I)^{-1} w = u; this means
that

    Lu + u = w.




Take the scalar product of both sides with u, and on the left side use identity (42).
Estimate the right side by the Schwarz inequality; we get

    \iint_T (u_x^2 + u_y^2)\, dx\, dy + \tfrac{1}{4} \|u\|_T^2 \le \tfrac{1}{2} \|w\|_T^2.    (47)

Since u(x, y) vanishes at x = ±1/2 for y > 1, by Wirtinger's inequality

    \int_{-1/2}^{1/2} u^2\, dx \le \frac{1}{\pi^2} \int_{-1/2}^{1/2} u_x^2\, dx.

Integrating this with respect to y over [Y, ∞] gives for Y > 1,

    \int_Y^\infty \int_{-1/2}^{1/2} u^2\, dx\, dy \le \frac{1}{\pi^2} \int_Y^\infty \int_{-1/2}^{1/2} u_x^2\, dx\, dy.

From this we deduce that

    \iint_{T^Y} u^2\, \frac{dx\, dy}{y^2} \le \frac{1}{Y^2} \int_Y^\infty \int_{-1/2}^{1/2} u^2\, dx\, dy \le \frac{1}{\pi^2 Y^2} \int_Y^\infty \int_{-1/2}^{1/2} u_x^2\, dx\, dy.    (48)

We claim that the image of the ball ||w||_T ≤ 1 under (L + I)^{-1} is a precompact set:
(47) shows that for functions u in this image the square integrals of u_x and u_y over
T are uniformly bounded. We use Rellich's compactness criterion (see theorem 2
in chapter 22), applied to the compact portion T_Y of T. Combined with (48), the
uniform smallness of the hyperbolic L^2-norm of u over the remainder T^Y of T shows
the precompactness of the set of u in the norm ||u||_T. It follows then from the spectral
theory of compact symmetric operators (see chapter 28) that (L + I)^{-1} has a complete
set of eigenfunctions over the space of odd functions in T, and that the eigenvalues
are real, positive, and tend to zero. The corresponding eigenvalues of L tend to ∞.
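
The one-dimensional inequality used in this argument, Wirtinger's inequality for functions vanishing at the endpoints of an interval of length 1, has sharp constant 1/π^2, attained by cos(πx). The sketch below checks it numerically for two test functions; the helper names and the test functions are illustrative assumptions.

```python
import math


def simpson(f, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3.0


def wirtinger_sides(u, du):
    """Return (int u^2 dx, (1/pi^2) int u'^2 dx) over [-1/2, 1/2] for u vanishing at both ends."""
    lhs = simpson(lambda x: u(x) ** 2, -0.5, 0.5)
    rhs = simpson(lambda x: du(x) ** 2, -0.5, 0.5) / math.pi ** 2
    return lhs, rhs


# A polynomial vanishing at x = -1/2 and x = 1/2: strict inequality.
poly = wirtinger_sides(lambda x: 0.25 - x * x, lambda x: -2.0 * x)
# The extremal function cos(pi x): both sides agree.
extremal = wirtinger_sides(lambda x: math.cos(math.pi * x),
                           lambda x: -math.pi * math.sin(math.pi * x))
```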

(iii) It is in this part of theorem 14 that the renormalization of the Laplace-Beltrami
operator plays an important role. We employ the hyperbolic wave equation

    u_{tt} + L u = 0    (49)

introduced by Faddeev and Pavlov. The conservation of energy can be derived in the
usual fashion by taking the scalar product of (49) with u_t. We get

    \frac{1}{2} \frac{d}{dt} \bigl[ (u_t, u_t)_T + (Lu, u)_T \bigr] = 0.

We conclude that the conserved energy is

    E_T(u) = (u_t, u_t)_T + (Lu, u)_T = \iint_T \left( \frac{u_t^2}{y^2} + u_x^2 + u_y^2 - \frac{u^2}{4y^2} \right) dx\, dy;    (50)

here we have used (42).




The bilinear functional associated with the quadratic functional E_T is

    E_T(u, v) = (u_t, v_t)_T + (Lu, v)_T.    (50')

Solutions of the hyperbolic wave equation are uniquely determined by their initial
data {u(0), u_t(0)}. Since the operator L is invariant under hyperbolic motions γ, if
u(z, t) is a solution of (49), defined for all z in ℍ, u(γ(z), t) too is a solution. If the
initial data of u are automorphic, then u(z, t) and u(γ(z), t) have the same initial
data and therefore are equal. In other words, if the initial data of a solution of the
hyperbolic wave equation are automorphic, the solution u(z, t) is automorphic for
all t.

Denote by U(t) the operator relating automorphic initial data {u(0), u_t(0)} of finite
energy in T to data at time t, {u(t), u_t(t)}. Since energy is conserved, E_T(u(t)) =
E_T(u(0)).


Exercise 16. Show that E_T(u(t), v(t)) = E_T(u(0), v(0)) for all pairs of automorphic
solutions of finite energy in T.


According to part (i) of theorem 14, the spectrum of L on [-1/4, 0] consists of
the single eigenvalue -1/4, with eigenfunction ≡ 1. It follows that if (u, 1)_T = 0,
energy defined in (50) is positive. We claim that if the initial data of a solution u are
orthogonal to 1,

    (u(0), 1)_T = 0 = (u_t(0), 1)_T,

then u(t) is orthogonal to 1 for all t. This follows since (u(t), 1)_T satisfies the
second-order equation

    \frac{d^2}{dt^2} (u(t), 1)_T = (u_{tt}, 1)_T = -(Lu, 1)_T = -(u, L1)_T = \tfrac{1}{4} (u, 1)_T.

Denote by H the space of all automorphic initial data with finite energy in T that
are orthogonal to all the eigenfunctions of L. The operators U(t) map H into itself.
Since energy is positive for such data, we define the square root of energy to be the
norm in H. The operators U(t) are unitary in the energy norm.

We will construct a pair of representations of H as L^2(R) that transmute the action
of U(t) into translation. Let h = {h_1, h_2} be an element of H, u(x, y, t) the solution
of the wave equation with initial data h. Denote by ū(y, t) the x-average of u:

    \bar u(y, t) = \int_{-1/2}^{1/2} u(x, y, t)\, dx.

Using the automorphic boundary condition (37), we get for the x-average of the wave
equation (49)

    \bar u_{tt} - y^2\, \bar u_{yy} - \tfrac{1}{4} \bar u = 0.



The change of variables ū = y^{1/2} v, y = e^s turns this equation into the classical
wave equation

    v_{tt} - v_{ss} = 0.

This can be factored as

    (\partial_t + \partial_s)(v_t - v_s) = 0;

it follows that v_t - v_s is a function of s - t. We define

    \sqrt{2}\, k_+(s) = v_s - v_t = \partial_s \bigl( e^{-s/2}\, \bar h_1(e^s) \bigr) - e^{-s/2}\, \bar h_2(e^s)    (51)

to be the outgoing translation representation of the initial data {h_1, h_2} = h. Clearly,
the outgoing representation of U(t)h is k_+(s - t).
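
The change of variables can be sanity-checked numerically: if v(s, t) = g(s - t) solves the classical wave equation, then ū(y, t) = y^{1/2} g(log y - t) should satisfy the averaged equation ū_tt - y^2 ū_yy - ū/4 = 0. The sketch below evaluates the residual with central differences; the profile g, the step size, and the function names are illustrative assumptions.

```python
import math


def g(s: float) -> float:
    """Arbitrary smooth outgoing profile (illustrative choice)."""
    return math.exp(-s * s)


def u_bar(y: float, t: float) -> float:
    """Averaged solution u_bar = y^{1/2} v with v(s, t) = g(s - t), s = log y."""
    return math.sqrt(y) * g(math.log(y) - t)


def residual(y: float, t: float, h: float = 1e-3) -> float:
    """Central-difference residual of u_tt - y^2 u_yy - u / 4 at the point (y, t)."""
    u_tt = (u_bar(y, t + h) - 2.0 * u_bar(y, t) + u_bar(y, t - h)) / h ** 2
    u_yy = (u_bar(y + h, t) - 2.0 * u_bar(y, t) + u_bar(y - h, t)) / h ** 2
    return u_tt - y * y * u_yy - 0.25 * u_bar(y, t)
```

The residual is not exactly zero because of the finite-difference truncation error, which is of order h^2.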
We next show that every function in L^2(R) appears as the representer of some
element in H, and that the representation is an isometry. Take any smooth function
m(s) of compact support in s > 0, that is, in 1 < y, and define w_+ on T as

    w_+(y, t) = y^{1/2}\, m(\log y - t).    (52)

For t ≥ 0, w_+(y, t) satisfies the matching conditions (37) on the boundary of T, and
therefore can be extended as an automorphic function to the whole hyperbolic plane.
In T, the initial data h_+ of w_+ are h_+ = {h_1, h_2}, where

    h_1 = y^{1/2}\, m(\log y), \quad h_2 = -y^{1/2}\, m'(\log y).    (52')

For s > 0, the outgoing representation of {h_1, h_2} is, according to formula (51),

    \sqrt{2}\, k_+(s) = \partial_s m(s) + m'(s) = 2 m'(s).    (51')

Since w_+(y, t) = 0 for 0 < t and 1 < y < e^t, the representer k_+(s, t) of
{w_+(y, t), ∂_t w_+(y, t)} is zero for 0 < s < t. Since k_+(s, t) = k_+(s - t), it follows
that k_+(s) = 0 for s < 0. In other words, (51') holds for all real s. Using formula
(50), we calculate the energy of w_+:


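    E_T(w_+) = \int_1^\infty \left[ \frac{(y^{1/2} m')^2}{y^2} + \left( \frac{m}{2 y^{1/2}} + \frac{m'}{y^{1/2}} \right)^2 - \frac{(y^{1/2} m)^2}{4 y^2} \right] dy
      = \int_0^\infty (2 m'^2 + m m')\, ds = 2 \int_0^\infty m'^2\, ds.

Clearly, E_T(w_+) = ||k_+||^2; in words:

For solutions w of form (52), the translation representation (51) is an isometry.

This explains the need for the factor √2 in formula (51).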
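
The isometry can be checked numerically for one profile. The sketch below assumes the polynomial bump m(s) = (s(2 - s))^3 on 0 < s < 2 (a C^2 stand-in for a smooth compactly supported function, chosen only for illustration), evaluates the energy (50) of the initial data (52') as an integral in y, and compares it with ||k_+||^2 = 2 ∫ m'^2 ds. The helper names are hypothetical.

```python
import math


def m(s: float) -> float:
    """Illustrative profile supported on (0, 2)."""
    return (s * (2.0 - s)) ** 3 if 0.0 < s < 2.0 else 0.0


def dm(s: float) -> float:
    """Derivative of m."""
    return 3.0 * (s * (2.0 - s)) ** 2 * (2.0 - 2.0 * s) if 0.0 < s < 2.0 else 0.0


def simpson(f, a, b, steps=20000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3.0


def energy_density(y: float) -> float:
    """Integrand of (50) at t = 0 for w = y^{1/2} m(log y - t); there is no x-dependence."""
    s = math.log(y)
    w_t = -math.sqrt(y) * dm(s)
    w_y = (0.5 * m(s) + dm(s)) / math.sqrt(y)
    w = math.sqrt(y) * m(s)
    return w_t ** 2 / y ** 2 + w_y ** 2 - w ** 2 / (4.0 * y ** 2)


# The x-integration over [-1/2, 1/2] contributes a factor 1.
energy = simpson(energy_density, 1.0, math.exp(2.0))
k_norm_sq = simpson(lambda s: 2.0 * dm(s) ** 2, 0.0, 2.0)
```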




The initial data h_+ of w_+ do not, in general, belong to H; to make them orthogonal
to the eigenfunction 1, we apply the projection Q:

    Q h_+ = \{ h_1 - c, \; h_2 - d \},

c and d being constants determined by the equations

    \iint_T (h_1 - c)\, \frac{dx\, dy}{y^2} = 0, \quad \iint_T (h_2 - d)\, \frac{dx\, dy}{y^2} = 0.

Using the formulas (52') for h_1 and h_2, we can rewrite these equations as

    \int_1^\infty y^{1/2} m(\log y)\, \frac{dy}{y^2} = cA, \quad -\int_1^\infty y^{1/2} m'(\log y)\, \frac{dy}{y^2} = dA,

where A is the area of T. Switching to s = log y as variable of integration yields

    \int_0^\infty e^{-s/2}\, m(s)\, ds = cA, \quad -\int_0^\infty e^{-s/2}\, m'(s)\, ds = dA.    (53)

Integration by parts shows that d = -c/2.
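
The relation d = -c/2 can be confirmed numerically from the two integrals in (53). The sketch below uses the illustrative profile m(s) = (s(2 - s))^3 on 0 < s < 2 (an assumption made here; any profile vanishing at both ends of its support would do) and hypothetical helper names.

```python
import math


def m(s: float) -> float:
    """Illustrative profile supported on (0, 2)."""
    return (s * (2.0 - s)) ** 3 if 0.0 < s < 2.0 else 0.0


def dm(s: float) -> float:
    """Derivative of m."""
    return 3.0 * (s * (2.0 - s)) ** 2 * (2.0 - 2.0 * s) if 0.0 < s < 2.0 else 0.0


def simpson(f, a, b, steps=20000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3.0


# Left sides of (53): the products c * A and d * A.
c_times_area = simpson(lambda s: math.exp(-s / 2.0) * m(s), 0.0, 2.0)
d_times_area = -simpson(lambda s: math.exp(-s / 2.0) * dm(s), 0.0, 2.0)
```

The boundary term e^{-s/2} m(s) vanishes at both ends of the support, which is what makes the two integrals proportional.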


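Lemma 15.

(i) Qh_+ has the same energy as h_+.
(ii) Qh_+ has the same translation representation as h_+.

Proof. (i) By definition (50),

    E_T(Q h_+) = \iint_T \left( \frac{(h_2 - d)^2}{y^2} + (\partial_y h_1)^2 - \frac{(h_1 - c)^2}{4 y^2} \right) dx\, dy
      = E_T(h_+) - 2d \iint_T h_2\, \frac{dx\, dy}{y^2} + d^2 A + \frac{c}{2} \iint_T h_1\, \frac{dx\, dy}{y^2} - \frac{c^2}{4} A.

By formula (53), this can be rewritten as

    E_T(h_+) - d^2 A + \frac{c^2}{4} A,

which is equal to E_T(h_+), since d = -c/2.

(ii) By definition (51), the difference between the outgoing translation representation
of h_+ and of Qh_+ is

    \frac{1}{\sqrt{2}} \bigl[ \partial_s (e^{-s/2} c) - e^{-s/2} d \bigr] = -\frac{1}{\sqrt{2}}\, e^{-s/2} \left( \frac{c}{2} + d \right) = 0. □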




Exercise 17. Show that for t > 0 the solution of the hyperbolic wave equation with
initial data Qh_+ is w_+(t) - c e^{-t/2}.


Clearly, the functions Qh_+ are orthogonal to all odd eigenfunctions. We claim
that they also are orthogonal to all even square integrable eigenfunctions p of L
with positive eigenvalue. To see this, take the x-average of the eigenvalue equation

    L p = \mu^2 p.

Solutions of the averaged equation are linear combinations of the functions y^{(1/2)+iμ}
and y^{(1/2)-iμ}. Neither of these is square integrable with respect to dy/y^2 near y = ∞,
whereas p, and therefore p̄, are. So it follows that p̄ = 0, from which we have the
orthogonality of p to w_+.
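
The exponents can be checked directly: on functions of y alone, L acts as -y^2 ∂_y^2 - 1/4, so y^a is reproduced with factor -a(a - 1) - 1/4, which equals μ^2 exactly when a = 1/2 ± iμ. The sketch below verifies this and the fact that |y^a|^2/y^2 = 1/y, whose integral diverges logarithmically at ∞. The function names are hypothetical, and writing the eigenvalue as μ^2 is an assumption about the notation.

```python
def averaged_L_eigenvalue(a: complex) -> complex:
    """Factor by which -y^2 d^2/dy^2 - 1/4 multiplies y^a, namely -a(a - 1) - 1/4."""
    return -a * (a - 1) - 0.25


def density(y: float, a: complex) -> float:
    """|y^a|^2 / y^2, the integrand of the squared norm with respect to dy / y^2."""
    return abs(y ** a) ** 2 / y ** 2
```

For μ = 3 both exponents 1/2 + 3i and 1/2 - 3i should give the factor 9, and the density at y = 10 should be 1/10.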

The functions k_+(s) that represent data in H include all functions of form m'(s),
m any C_0^∞ function supported on R_+, and their translates; these are dense in L^2(R).
Denote by K the subspace of the corresponding data in H. K is a closed invariant
subspace for the operators U(t), which are represented as translations on R. It is
not hard to show that L has absolutely continuous spectrum on K, which covers all
of R_+. □


In chapter 36, section 36.4, we introduced the concept of an outgoing subspace
F_+ for a group of unitary operators U(t), as follows:

(i) U(t) F_+ ⊂ F_+ for t > 0.
(ii) ∩ U(t) F_+ = {0}.
(iii) The closure of ∪ U(t) F_+ is H.

We claim that the space formed by the initial data Qh_+, where h_1, h_2 are given by
formula (52'), is such an outgoing subspace F_+. Properties (i) and (ii) are obvious.
Property (iii) asserts that K, the closure of the union of U(t) F_+, is all of H; a short
proof can be found in Lax and Phillips's paper in the Transactions of the AMS.

It is worth pointing out that we have constructed the translation representation 
directly, without appealing to the translation representation theorem in chapter 35, 
section 35.5. 

The theory described above for the modular group can be carried over with only
trivial changes to any discrete subgroup, except that in general, one cannot determine
precisely the location of the point spectrum. According to a conjecture/theorem of
Phillips, Sarnak and Wolpert, there is, in general, no point spectrum embedded in the
continuous spectrum.

If the fundamental polygon has n vertices at infinity, the continuous spectrum of L
has multiplicity n on R_+. Even if a whole side of the fundamental polygon lies at
infinity, the theory can be pushed through, with a continuous spectrum of infinite
multiplicity.




There is an entirely analogous construction of an incoming representation; the
incoming subspace F_- is formed by Qh_-, where h_- denotes the initial data on T
of incoming solutions w_- of form w_-(y, t) = y^{1/2} n(log y + t), t < 0, where n is
supported on R_+.

Exercise 18. Show that for t < 0, the solution of the hyperbolic wave equation with
initial data Qh_- is w_-(t) - c e^{t/2}.

Exercise 19. Show that F_+ and F_- are orthogonal in the energy norm.

Clearly, w_- = y^{1/2} n(log y + t), t < 0, describes a wave arriving in T from
infinity through the channel -1/2 < x < 1/2, just as w_+ is for t > 0 a wave traveling
to infinity through the same channel. It follows from property (iii) for F_- and F_+
that everything that flows in from infinity eventually flows out to infinity. How fast
is an interesting question, as we will show.

It was explained in section 37.8 that a pair of orthogonal incoming and outgoing
translation representations are linked to each other by a scattering operator. The
corresponding spectral representations are related via multiplication by a scattering
matrix M(λ). In the present situation, where the multiplicity of the continuous
spectrum is one, the scattering matrix is a scalar function. |M(λ)| = 1 for λ real,
and M has an analytic continuation into the lower half-plane λ + iη, η < 0, where
|M(λ + iη)| ≤ 1. In chapter 38 we will meet these functions again in Beurling's
theory of the arithmetic of bounded analytic functions.

Faddeev and Pavlov have determined the scalar scattering matrix that arises from
the hyperbolic wave equation for solutions that are automorphic with respect to the
modular group. It is, aside from inessential factors,

    M(\lambda) = F(\lambda)\, \frac{\zeta(2i\lambda)}{\zeta(1 + 2i\lambda)},

where F(λ) is a product of gamma functions, and ζ is the Riemann zeta function.

If the Riemann hypothesis is true, M has zeros in the lower half-plane on the line
-(1/4)i + λ; its meromorphic continuation into the upper half-plane by Schwarz
reflection has poles on the line (1/4)i + λ. We saw in section 37.8, theorems 13 and 13', that
if λ + iη is a zero of M, then η + iλ is an eigenvalue of G, the generator of the
semigroup Z(t) associated with the group of unitary operators U(t) and the pair of
incoming and outgoing subspaces F_- and F_+. Faddeev and Pavlov point out that if
one could show that G has no eigenvalues γ whose real part exceeds -1/4, the Riemann
hypothesis would be proved. According to Phillips's spectral mapping theorem
for semigroups, theorem 12 in chapter 34, if γ belongs to the spectrum of G, e^{γt}
belongs to the spectrum of Z(t). Since the spectral radius does not exceed the norm,
|e^{γt}| ≤ ||Z(t)||_E. Taking logarithm and the limit t → ∞, we deduce that

    \mathrm{Re}\, \gamma \le \lim_{t \to \infty} \frac{1}{t} \log \|Z(t)\|_E.



Therefore to prove the Riemann hypothesis, it is sufficient to show that

    \lim_{t \to \infty} \frac{1}{t} \log \|Z(t)\|_E \le -\frac{1}{4}.    (54)

Faddeev and Pavlov point out that (54) is necessary as well. It is not hard to show
that (54) would follow if

    \lim_{t \to \infty} \frac{1}{t} \log \|Z(t) h\|_E \le -\frac{1}{4}

could be proved for a set of data h dense in the domain of the semigroup.
Could this formulation lead to a proof of the Riemann hypothesis? If it does, you 
will hear about it. 


BIBLIOGRAPHY 


Beardon, A. F. The Geometry of Discrete Groups. Graduate Texts in Mathematics, 91. Springer-Verlag,
1983.


Birman, M. Sh. A test for the existence of the wave operators. Dokl. Akad. Nauk. SSSR, 147 (1962): 
506-509. 


Cook, J. M. Convergence to the Møller wave matrix. J. Math. Phys., 36 (1957): 82-87. 


Faddeev, L. D. The inverse problem in the quantum theory of scattering. Usp. Mat. Nauk., 14, 57 (1959);
English translation by B. Seckler, J. Math. Phys., 4 (1963): 72-104.


Faddeev, L. D. and Pavlov, B. S. Scattering theory and automorphic functions. Seminar Steklov Math, Inst. 
Leningrad, 27 (1972): 161-193. 


Fröhlich, J. and Spencer, T. A rigorous approach to Anderson localisation. Common Trends in Particle
and Condensed Matter Physics. Les Houches, 1983; Phys. Rev., 103 (1984): 1-4, 9-25.


Heisenberg, W. Die beobachtbaren Grössen in der Theorie der Elementarteilchen. Z. Physik, 120 (1943):
I, 513-538; II, 673-702.


Jauch, J. M. Theory of the scattering operator. Helv. Phys. Acta, 31 (1958): 127-158.

Kato, T. Wave operators and unitary equivalence. Pacific J. Math., 15 (1965): 171-180.

Kato, T. Perturbation of continuous spectra by trace class operators. Proc. Jap. Acad., 33 (1957): 260-264.


Kato, T. Perturbation Theory for Linear Operators. Grundlehren der Math. Wiss in Einzeldarstellung,
132. Springer-Verlag, 1966.


Kuroda, S. T. On a theorem of Weyl-von Neumann. Proc. Jap. Acad., 34 (1958).

Kuroda, S. T. On the existence and the unitary property of the scattering operator. Nuovo Cimento, 12
(1959): 431-454.

Lax, P. D. and Phillips, R. S. Scattering Theory. Academic Press, New York, 1967.


Lax, P. D. and Phillips, R. S. Scattering Theory for Automorphic Functions. Ann. Math. Studies, Princeton 
University Press, Princeton, 1976. 


Lax, P. D. and Phillips, R. S. Translation representation for automorphic solutions of the wave equation in 
non-Euclidean spaces. CPAM, 37 (1984): 303-328, 780-813. 




Lax, P. D. and Phillips, R. S. Translation representation for automorphic solutions of the wave equation in 
non-Euclidean cases; the case of finite volume. Trans. AMS, 289 (1985): 715-735. 


Møller, C. General properties of the characteristic matrix in the theory of elementary particles. Kgl. Dansk.
Videnskab. Selskab, Mat.-fys. Medd., 22, 1 (1945); 23, 10 (1946).


von Neumann, J. Characterisierung des Spectrums eines Integraloperators. Actualités Sci. Ind., 229
(1935): 38-55.


Phillips, R. S. and Sarnak, P. Perturbation theory for the Laplacean on automorphic functions. J. AMS, 5 
(1992): 1-3. 


Reed, M. and Simon, B. Scattering Theory. Academic Press, New York, 1979.
Rellich, F. Störungstheorie der Spectralzerlegung. Math. Ann., 113 (1937): 600-619, 677-685; 116 (1939):
555-570; 117 (1940): 356-382; 118 (1942): 462-484.


Rosenblum, M. Perturbation of the continuous spectrum and unitary equivalence. Pacific J. Math., 7
(1957): 997-1010.


Schatten, R. A Theory of Cross Spaces. Ann. Math. Studies, 26. Princeton University Press, Princeton,
1950.


Weyl, H. Über beschränkte quadratische Formen, deren Differenz vollstetig ist. Rend. Circ. Palermo, 27
(1909): 373-392.


Wolpert, S. A. Disappearance of cusp forms in special families. Ann. Math., (2), 139 (1994): 239-291.


A THEOREM OF BEURLING 


38.1 THE HARDY SPACE 


In this chapter we study the space of square integrable analytic functions in relation
to the algebra of bounded analytic functions.

The Hilbert space $\ell^2$ consists of vectors $x = (a_0, a_1, \ldots)$, $a_j$ complex numbers
such that


    $\|x\|^2 = \sum_0^\infty |a_j|^2 < \infty$.    (1)

This space can be represented as a space of analytic functions $f(z)$ in the unit disk:

    $f(z) = \sum_{n=0}^\infty a_n z^n$.    (2)


In this representation the space is called Hardy space and is denoted as $H_+$. We have
already come across this space in chapter 27, section 27.2.
The $L^2$-norm of $f$ on any circle of radius $r < 1$ is


    $\frac{1}{2\pi} \int_0^{2\pi} |f(re^{i\theta})|^2 \, d\theta = \sum_0^\infty |a_n|^2 r^{2n}$.    (3)


We define


    $\|f\|^2 = \sup_{r<1} \frac{1}{2\pi} \int_0^{2\pi} |f(re^{i\theta})|^2 \, d\theta$    (3')


as the norm in $H_+$, isometric with (1). The difference $f(re^{i\theta}) - f(se^{i\theta})$ has norm


    $\|f(re^{i\theta}) - f(se^{i\theta})\|^2 = \sum_0^\infty |a_n|^2 (r^n - s^n)^2$,


which shows that $f(re^{i\theta})$ converges as $r \to 1$ in the $L^2$ sense. This limit is the
boundary value of the function $f(z)$ on the unit circle,


    $f(e^{i\theta}) = \sum_0^\infty a_n e^{in\theta}$,    (4)


where the series on the right converges in the $L^2$ sense. Its $L^2$ norm is the $H_+$-norm
of $f$:


    $\|f\|^2 = \frac{1}{2\pi} \int_0^{2\pi} |f(e^{i\theta})|^2 \, d\theta$.    (4')
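As a quick check of the norm formulas (3), (3') and (4'), here is a worked example. The function $f(z) = 1/(1 - az)$ with a fixed $|a| < 1$ is our own choice for illustration, and we assume the integrals are taken with the normalized measure $d\theta/2\pi$, which is what makes the norm agree with (1).

```latex
\begin{align*}
f(z) &= \frac{1}{1 - az} = \sum_{n=0}^{\infty} a^n z^n, \qquad a_n = a^n, \\
\frac{1}{2\pi}\int_0^{2\pi} |f(re^{i\theta})|^2 \, d\theta
   &= \sum_{n=0}^{\infty} |a|^{2n} r^{2n} = \frac{1}{1 - |a|^2 r^2}, \\
\|f\|^2 &= \sup_{r<1} \frac{1}{1 - |a|^2 r^2} = \frac{1}{1 - |a|^2}
         = \sum_{n=0}^{\infty} |a_n|^2 .
\end{align*}
```

The supremum is approached as $r \to 1$ and equals the squared norm of the coefficient sequence, as the isometry with (1) requires.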


Denote by $\mathcal{B}$ the algebra of bounded analytic functions in the open unit disk.
Define the norm $|b|$ to be the sup norm:


    $|b| = \sup_{|z|<1} |b(z)|$.    (5)


Functions $b$ in $\mathcal{B}$ have boundary values in the sense of convergence a.e. as $r \to 1$.


Theorem 1. 


(i) If the boundary values of a function $f$ in $H_+$ are bounded, then $f$ belongs
to $\mathcal{B}$.

(ii) Let $b$ denote a function in $\mathcal{B}$, and denote by $B: H_+ \to H_+$ the operation of
multiplying a function $f$ in $H_+$ by $b$. We claim that $B$ is bounded, and that


    $\|B\| = |b|$.    (6)


Proof. (i) Let $s_\epsilon$ be a sequence of smooth functions approximating the $\delta$ function
(see chapter 11, section 11.11). For any function $f$ of class $H_+$, define $f_\epsilon$ by


    $f_\epsilon(z) = \int f(ze^{i\theta}) \, s_\epsilon(\theta) \, d\theta$.    (7)


Clearly, $f_\epsilon$ is of class $H_+$ and tends to $f$ in the $H_+$-norm as $\epsilon$ tends to zero.
Furthermore $f_\epsilon(z)$ is continuous in the closed unit disk. If the boundary values of $|f|$
are bounded by, say, 1, so are the boundary values of $f_\epsilon$. According to the maximum
principle, at any interior point $z$ of the unit disk $|f_\epsilon(z)| \le 1$, but then also their
$L^2$ limit satisfies $|f(z)| \le 1$ at all points with $|z| < 1$.

(ii) Using the definition (3') of the norm in $H_+$, we deduce that for any function $b$
in $\mathcal{B}$, $\|bf\| \le |b| \, \|f\|$. It follows that


    $\|B\| \le |b|$.    (6')


To see that the sign of equality holds, we argue as follows: if not, we would have,
after renormalizing $b$,


    $\|B\| < 1 < |b|$.


It would follow that $\|B^n\| \le \|B\|^n$ tends to zero as $n \to \infty$. On the other hand, for
any $f \ne 0$ in $H_+$,


    $\int |b^n(re^{i\theta}) f(re^{i\theta})|^2 \, d\theta$


tends to $\infty$ as $n \to \infty$ if $r$ is so large that $\max_\theta |b(re^{i\theta})| > 1$. This shows, by (3'),
that $\|B^n f\| \to \infty$, a contradiction.


Exercise 1. Show that every bounded mapping $C: H_+ \to H_+$ that commutes with
multiplication by all functions in $\mathcal{B}$ is itself multiplication by some function $c$ in $\mathcal{B}$.


Exercise 2. Show that $\mathcal{B}$ has no divisors of zero; that is, if the product $bc$ of two
functions $b$ and $c$ in $\mathcal{B}$ is zero, then one of the factors $b$ or $c$ is zero.


38.2 BEURLING’S THEOREM 


The basic result of this chapter, due to Arne Beurling, establishes an important relation
between the Hilbert space $H_+$ and the algebra $\mathcal{B}$:


Theorem 2. Let $N$ be a closed subspace of $H_+$ that is invariant under multiplication
by functions $b$ in $\mathcal{B}$, namely that $bN \subset N$ for all $b$ in $\mathcal{B}$. Then $N$ can be represented
as


    $N = pH_+$,    (8)

where $p$ is a function in $\mathcal{B}$ that has absolute value 1 on the unit circle:

    $|p(e^{i\theta})| = 1$;    (9)


$p$ is unique up to a complex constant factor of absolute value 1.
The beautiful proof below is due to Paul Halmos. 


Proof. It is easy to see that every $N$ of form (8) is invariant under multiplication by
$b$ in $\mathcal{B}$. It follows from (4') and (9) that multiplication by $p$ is an isometry; therefore
$N$ is closed.

Conversely, consider any given closed invariant subspace $N$. We claim that $zN$ is
a proper subspace of $N$; for if not, every $f$ in $N$ could be written as


    $f = zf_1 = z^2 f_2 = \cdots$,


which would show that $f$ has a zero of infinite order at $z = 0$, impossible for an
analytic function that is not identically zero.

Multiplication by $z$ is an isometry of $H_+$; therefore $zN$ is a closed proper subspace
of $H_+$. Denote its orthogonal complement in $N$ by $M$:


    $N = M \oplus zN$.    (10)


Replace $N$ on the right by its orthogonal decomposition given by (10); $k$-fold repetition
of this operation gives


    $N = M \oplus zM \oplus \cdots \oplus z^{k-1}M \oplus z^k N$.    (10')


Letting $k$ tend to $\infty$, we deduce from (10') that


    $N \supseteq M \oplus zM \oplus z^2 M \oplus \cdots$.    (11)


We claim that the right side of (11) is actually equal to $N$; for if not, there would be a
nonzero $g$ in $N$ that is orthogonal to every $z^j M$. But by (10'), such a $g$ would belong to $z^k N$
for every $k$, and thus would have a zero of infinite order at $z = 0$. This is impossible,
and so


    $N = M \oplus zM \oplus z^2 M \oplus \cdots$.    (12)


Next we examine the space $M$. Let $m$ be any function in $M$; it follows from (10')
that $m$ is orthogonal to $z^k N$, $k \ge 1$, and so in particular, to $z^k m$:


    $(z^k m, m) = \int e^{ik\theta} |m(e^{i\theta})|^2 \, d\theta = 0$,  $k = 1, 2, \ldots$.    (13)


Taking complex conjugates we conclude that (13) holds for $k = -1, -2, \ldots$ as well.
Thus all Fourier coefficients of $|m(e^{i\theta})|^2$ except the 0th are $= 0$, which implies that
$|m(e^{i\theta})|$ is constant.
We claim that $M$ is one dimensional. To see this, let $m$ and $p$ be two functions
in $M$; then $m + ap$, $a$ any constant, belongs to $M$, and so by what has been shown
above, for $z = e^{i\theta}$,


    $|m + ap|^2 = (m + ap)(\bar m + \bar a \bar p) = |m|^2 + |a|^2 |p|^2 + 2 \operatorname{Re} a p \bar m = \text{const}$.


Since $a$ is an arbitrary complex constant, $p\bar m$ is constant; dividing by $m\bar m$, we conclude
that $p/m$ is constant, that is, that $p$ and $m$ are proportional.


Normalize $p(e^{i\theta})$ in $M$ to have $|p| = 1$; then all functions in $M$ are constant
multiples of $p$. Setting this into (12) shows that every function $f$ in $N$ can be decomposed
as

    $f = a_0 p + z a_1 p + \cdots = p(a_0 + a_1 z + \cdots) = pg$.    (14)


Since $|p(e^{i\theta})| = 1$, $|f(e^{i\theta})| = |g(e^{i\theta})|$; since $f$ belongs to $H_+$, so does $g$. Thus
(14) is the desired representation (8) of Beurling's theorem.
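The steps of the proof can be traced in the simplest nontrivial case. The subspace below, consisting of the functions in $H_+$ that vanish at the origin, is our own illustrative choice; the symbol $\ominus$ for orthogonal complement within $N$ is notation we introduce here.

```latex
\begin{align*}
N &= \{ f \in H_+ : f(0) = 0 \} = \Big\{ \sum_{n \ge 1} a_n z^n \Big\},
  \qquad zN = \Big\{ \sum_{n \ge 2} a_n z^n \Big\}, \\
M &= N \ominus zN = \{ a z : a \in \mathbb{C} \},
  \qquad |m(e^{i\theta})| = |a| \ \text{ for } m = a z, \\
N &= M \oplus zM \oplus z^2 M \oplus \cdots
   = \operatorname{span}\{z\} \oplus \operatorname{span}\{z^2\} \oplus \cdots, \\
p(z) &= z, \qquad N = zH_+ .
\end{align*}
```

Here $M$ is visibly one dimensional, its elements have constant absolute value on the unit circle, and the normalized generator $p(z) = z$ is an inner function.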


Exercise 3. Show that p is uniquely determined by N, up to a constant factor of 
absolute value 1. 


A function p in the algebra B that has absolute value 1 on the unit circle is called 
an inner function. 

Note that in the proof of theorem 2 we only used the fact that N is invariant under 
multiplication by z. This is no gain in generality, as shown in 


Exercise 4. Show that any closed subspace $N$ of $H_+$ that is invariant under multiplication
by $z$ is invariant under multiplication by any function in $\mathcal{B}$.




Beurling has shown how to use theorem 2 to factor bounded analytic functions:

Theorem 3. Every function $b$ in $\mathcal{B}$ can be factored essentially uniquely as

    $b = pu$,    (15)

where $p$ and $u$ are bounded analytic functions, $|p(e^{i\theta})| = 1$, and $uH_+$ is dense in $H_+$.


Proof. Define $N$ to be the closure of $bH_+$. Clearly, $N$ is invariant under multiplication
by $z$, so by Beurling's theorem $N$ is of the form $N = pH_+$. Since $b$ belongs
to $N$, it is of the form $pu$, $u$ in $H_+$. Since $|p(z)| = 1$ for $|z| = 1$, $|b(z)| = |u(z)|$ there;
since $b$ is bounded, so is $u$.

Denote the closure of a set $S$ in $H_+$ as $\overline{S}$. By definition, $N = \overline{bH_+}$; using the
factored form of $b$, we can write this as $N = \overline{puH_+}$. Since $|p| = 1$ for $|z| = 1$,
multiplication by $p$ is an isometry, so


    $N = \overline{puH_+} = p\,\overline{uH_+}$.


Since $N$ is of the form $pH_+$, we deduce from this that $\overline{uH_+} = H_+$. That $p$ and $u$ are
unique up to constants of absolute value 1 follows from the uniqueness of $p$ in the
representation (8).


The function p is called the inner factor of b, u its outer factor. 
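A small example may make the two factors concrete. The function $b(z) = z(z - 2)$ is a hypothetical choice of ours, not taken from the text; the point is only to exhibit one inner and one outer factor explicitly.

```latex
\begin{align*}
b(z) &= z\,(z - 2), \qquad p(z) = z, \qquad u(z) = z - 2, \\
|p(e^{i\theta})| &= 1, \qquad 1 \le |u(z)| \le 3 \ \text{ for } |z| \le 1, \\
\frac{1}{u} \in \mathcal{B}
  &\;\Longrightarrow\; uH_+ = H_+ \quad
  \Big(\text{every } f \in H_+ \text{ equals } u \cdot \tfrac{f}{u}
        \text{ with } \tfrac{f}{u} \in H_+\Big).
\end{align*}
```

So $p = z$ is inner and $u = z - 2$ is outer in the sense of theorem 3. In this example $uH_+$ is all of $H_+$, not merely dense; density without equality occurs when $u$ has zeros on the unit circle.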

The product of two inner factors is clearly an inner factor, and it is not hard to
show that the product of two outer factors is outer. It follows from this that if $b_1$ and
$b_2$ are a pair of bounded analytic functions, the inner factor of the product $b_1 b_2$ is the
product of the inner factors of $b_1$ and $b_2$, and likewise for their outer factors. This
shows that to establish that a bounded analytic function is divisible by another, it is
sufficient to verify the divisibility of its inner and outer factors by those of the other
function.

Divisibility by an outer factor is a particularly simple matter: 


Theorem 4. Let $b$ be any function in $\mathcal{B}$, $u$ an outer function. Then $b$ is divisible by $u$
in the algebra $\mathcal{B}$ iff $b(z)/u(z)$ is a bounded function on the unit circle $|z| = 1$.


Proof. By definition of outer factor, $uH_+$ is dense in $H_+$. In particular, there is a
sequence $\{c_n\}$ in $H_+$ such that


    $L^2\text{-}\lim uc_n = 1$.    (16)


Suppose $b/u$ is bounded on the unit circle. We claim that the sequence $\{bc_n\}$ converges
in $H_+$. To see this, we write $bc_n = (b/u)\,uc_n$; since $b/u$ is bounded on the unit
circle and $uc_n$ converges in $H_+$ to 1, we deduce that $\{bc_n\}$ converges in $L^2$ on the
unit circle to a limit we denote by $d$:


    $L^2\text{-}\lim bc_n = d$.    (16')


The functions $bc_n$ belong to $H_+$; therefore so does their $H_+$ limit $d$.
Multiply (16') by the bounded function $u$; the resulting sequence converges in $H_+$:


    $L^2\text{-}\lim ubc_n = ud$.    (16'')


On the other hand, using (16), we conclude that the sequence on the left in (16'')
converges to $b$. This proves that $b/u = d$ belongs to $H_+$. Since, by hypothesis, $b/u$
is bounded on the unit circle, it follows from part (i) of theorem 1 that $b/u$ belongs
to $\mathcal{B}$.
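To see what an approximating sequence as in (16) can look like, consider a sketch with $u(z) = 1 - z$. Both this $u$ and the sequence $c_n(z) = 1/(1 - r_n z)$, with $0 < r_n < 1$ and $r_n \to 1$, are our own illustrative choices.

```latex
\begin{align*}
u(z)\,c_n(z) - 1 &= \frac{1 - z}{1 - r_n z} - 1
   = -(1 - r_n)\,\frac{z}{1 - r_n z}
   = -(1 - r_n) \sum_{k=0}^{\infty} r_n^k z^{k+1}, \\
\| u c_n - 1 \|^2 &= (1 - r_n)^2 \sum_{k=0}^{\infty} r_n^{2k}
   = \frac{(1 - r_n)^2}{1 - r_n^2} = \frac{1 - r_n}{1 + r_n} \longrightarrow 0 .
\end{align*}
```

Since the closure of $uH_+$ is invariant under multiplication by $z$ and contains 1, it contains all polynomials, so this computation also confirms that $1 - z$ is outer. By theorem 4, $b = (1 - z)^2$ is then divisible by $u$ in the algebra, because $b/u = 1 - z$ is bounded on the unit circle, whereas $b = 1$ is not, because $1/(1 - z)$ is unbounded there.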


The simplicity of the criterion for division by an outer function shows that they 
may be called quasi-units of the algebra B. 
We turn now to divisibility by inner functions: 


Theorem 5. An inner function $p$ divides a function $b$ in $\mathcal{B}$ iff $pH_+$ contains $bH_+$.


Proof. Clearly, if $b = pc$, then $bH_+ = pcH_+$ is contained in $pH_+$. Conversely,
if $bH_+$ is contained in $pH_+$, then $b = b \cdot 1$ is of the form $pf$, $f$ in $H_+$. This shows
that $b/p$ belongs to $H_+$. But, since $b$ is bounded and $p$ has absolute value 1 on the
boundary, their quotient is bounded there. This shows according to theorem 1 that $b/p$
belongs to $\mathcal{B}$.


In this proof the closed invariant subspaces of $H_+$ play the role of ideals in the
algebra $\mathcal{B}$; Beurling's theorem is an analogue of the principal ideal theorem. We will
exploit this idea further.


Definition. Let $b$ and $c$ be two functions in $\mathcal{B}$. Denote by $N$ the closure of
$bH_+ + cH_+$:


    $\overline{bH_+ + cH_+} = N$.    (17)


According to theorem 2, $N$ is of the form $rH_+$, $r$ an inner function. We define $r$ to
be the greatest common divisor of $b$ and $c$.


The g.c.d. defined above has the usual properties: 


Theorem 6. Suppose that $b$, $c$, and $s$ belong to $\mathcal{B}$, and that $s$ is an inner function
that divides both $b$ and $c$ in $\mathcal{B}$. Then $s$ divides the g.c.d. of $b$ and $c$.


Proof. According to theorem 5, $s$ divides $b$ iff $sH_+$ contains $bH_+$, and divides $c$
iff $sH_+$ contains $cH_+$. It follows that then the linear space $sH_+$ contains $bH_+ + cH_+$.
Since $sH_+$ is closed,


    $sH_+ \supseteq \overline{bH_+ + cH_+}$,    (18)


but the right side is $rH_+$, $r$ the g.c.d. of $b$, $c$. Appealing again to theorem 5, we
conclude that $s$ divides $r$.




Definition. Two functions $b$ and $c$ in $\mathcal{B}$ are relatively prime if their g.c.d. is 1. According
to definition (17) this means that $b$ and $c$ are relatively prime iff $bH_+ + cH_+$
is dense in $H_+$.
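Two hypothetical pairs, chosen by us for illustration, show how definition (17) produces a g.c.d. and how relative primeness can be recognized.

```latex
\begin{align*}
b = z^2,\ c = z^3: \quad
  & bH_+ + cH_+ = z^2\,(H_+ + zH_+) = z^2 H_+, \qquad r = z^2; \\
b = z,\ c = z - \tfrac12: \quad
  & b \cdot 2f - c \cdot 2f = f \ \text{ for every } f \in H_+
    \;\Longrightarrow\; bH_+ + cH_+ = H_+, \qquad r = 1 .
\end{align*}
```

In the first case the sum is already closed and the g.c.d. is the lower power of $z$; in the second the two functions have no common zero in the disk and are relatively prime.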


Theorem 7. Let $c$, $d$ and $e$ be three functions in $\mathcal{B}$, $e$ relatively prime to both $c$ and $d$.
Then $e$ is relatively prime to their product $cd$.


Proof. By definition of relatively prime, $eH_+ + cH_+$ and $eH_+ + dH_+$ are dense
in $H_+$. Then $d(eH_+ + cH_+)$ is dense in $dH_+$; therefore $eH_+ + d(eH_+ + cH_+) =
eH_+ + dcH_+$ is dense in $H_+$. This proves that $e$ and $dc$ are relatively prime.


These results on divisibility in the algebra $\mathcal{B}$ can be used to develop a theory of
primes in $\mathcal{B}$:


Theorem 8. Let $u$ be any point in the unit disk: $|u| < 1$. The function


    $p(z) = \frac{z - u}{\bar u z - 1}$    (19)


is a prime inner function of the algebra $\mathcal{B}$.


Proof. An elementary calculation shows that $|p(z)| = 1$ for $|z| = 1$, so $p$ given
by (19) is an inner function. To see that it is a prime, we note that since $p$ vanishes
at $z = u$, so does every function in $N = pH_+$. Conversely, if $f$ in $H_+$ vanishes at
$z = u$, then $f/p$ belongs to $H_+$, so $f$ belongs to $N = pH_+$. This shows that $pH_+$
consists exactly of those functions in $H_+$ that vanish at $u$. It follows that $N$ has
codimension 1 in $H_+$. This implies that $p$ is a prime. To see this, suppose that $q$ is
an inner function that divides $p$; according to theorem 5, then $qH_+$ contains $pH_+$.
Since $pH_+$ has codimension 1, $qH_+$ is either $pH_+$ or $H_+$; in either case $q$ is, by
theorem 3, a trivial factor of $p$.
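The elementary calculation mentioned at the start of the proof can be written out as follows, taking (19) in the form $p(z) = (z - u)/(\bar u z - 1)$, which is our reading of the formula. For $|z| = 1$:

```latex
\begin{align*}
|z - u|^2 &= |z|^2 - 2\operatorname{Re}(\bar u z) + |u|^2
           = 1 - 2\operatorname{Re}(\bar u z) + |u|^2, \\
|\bar u z - 1|^2 &= |u|^2 |z|^2 - 2\operatorname{Re}(\bar u z) + 1
           = |u|^2 - 2\operatorname{Re}(\bar u z) + 1, \\
|p(z)|^2 &= \frac{|z - u|^2}{|\bar u z - 1|^2} = 1 .
\end{align*}
```

The denominator does not vanish in the closed disk, since $|\bar u z| \le |u| < 1$ there, so $p$ is analytic and bounded in the disk as well.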


We will see a little later that the functions (19) are the only primes in $\mathcal{B}$. However,
$\mathcal{B}$ contains in addition prime powers. These are technically easier to analyze if we
switch from the unit disk to the upper half-plane. The transformation


    $z = \frac{w - i}{w + i}$    (20)

maps the upper half-plane $\operatorname{Im} w > 0$ onto the unit disk $|z| < 1$. The relation

    $g(w) = f\left(\frac{w - i}{w + i}\right)$    (20')


assigns a bounded analytic function $g$ in the upper half-plane to every bounded analytic
function $f$ in the unit disk, and conversely. This relation is an isometric isomorphism
of these two normed algebras. We will use the same symbol $\mathcal{B}$ for either
of these two isomorphic algebras.
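That the transformation carries the upper half-plane onto the unit disk can be checked directly. The sketch below assumes (20) is the Cayley transform $z = (w - i)/(w + i)$ and writes $w = x + iy$.

```latex
\begin{align*}
|w + i|^2 - |w - i|^2 &= \big(x^2 + (y+1)^2\big) - \big(x^2 + (y-1)^2\big) = 4y, \\
|z| = \left| \frac{w - i}{w + i} \right| < 1
  &\iff |w - i|^2 < |w + i|^2 \iff y = \operatorname{Im} w > 0, \\
w = i &\mapsto z = 0, \qquad w \ \text{real} \mapsto |z| = 1 .
\end{align*}
```

The inverse map is $w = i(1 + z)/(1 - z)$, so the correspondence is one to one and the real axis goes to the unit circle with the point $z = 1$ removed.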


Theorem 9. The function


    $p(w) = e^{iw}$    (21)


is an inner function. Furthermore it is a prime power in the sense that its only factorization
by inner functions is


    $e^{iw} = e^{iaw} e^{ibw}$,  $a, b > 0$,  $a + b = 1$.    (22)


Proof. Obviously $|p(w)| < 1$ for $\operatorname{Im} w > 0$, $= 1$ for $\operatorname{Im} w = 0$. Suppose that
there were a nontrivial factorization


    $e^{iw} = p(w)\,q(w)$,    (22')


$p$ and $q$ inner functions. Denote the real and imaginary parts of $w$ as $x + iy$. Take
the log of the absolute value of (22'):


    $-y = \log |p| + \log |q|$.    (22'')

We define 
h = — log |p| (23) 

and claim: 


(i) h is harmonic in the upper half-plane.
(ii) h is positive in the upper half-plane and is continuous down to the boundary 
y = 0, where it is equal to zero. 


For it follows from (22') that $p$ and $q$ have no zeros in the upper half-plane; this
shows that $h$, the real part of the analytic function $-\log p$, is harmonic as asserted
in (i).

Since both $p$ and $q$ are nontrivial inner functions, $|p|$ and $|q|$ are $< 1$, and so
$\log |p|$ and $\log |q|$ are $< 0$ in the upper half-plane. These inequalities combined with
(22'') give


    $0 < h(x, y) < y$.


The assertions (ii) follow immediately from these inequalities. 

A harmonic function that vanishes on a straight line boundary can be continued
by reflection across that boundary as a regular harmonic function. Thus $h$ can be
extended to the whole plane by setting


    $h(x, -y) = -h(x, y)$.


We apply now the Poisson formula, see equation (29') in chapter 11, to $h$, replacing
the unit disk by a disk of radius $R$:


    $h(x, y) = \frac{1}{2\pi} \int_0^{2\pi} \frac{R^2 - r^2}{R^2 - 2Rr\cos(\theta - \phi) + r^2} \, k(\phi, R) \, d\phi$,    (24)


where $(x, y) = r(\cos\theta, \sin\theta)$ and

    $k(\phi, R) = h(R\cos\phi, R\sin\phi)$.    (24')


Since $h$ is an odd function of $y$, $k(-\phi, R) = -k(\phi, R)$, so we can rewrite (24) as


    $h(x, y) = \frac{1}{2\pi} \int_0^{\pi} Q(R, r, \theta, \phi) \, k(\phi, R) \, d\phi$,
    $Q = P(R, r, \theta - \phi) - P(R, r, \theta + \phi)$,    (25)


where $P$ is the Poisson kernel appearing in the integral (24). Using the formula for $P$,
we can express $Q$ explicitly as


    $Q = \frac{(R^2 - r^2)\,4Rr\sin\theta\sin\phi}{(R^2 - 2Rr\cos(\theta - \phi) + r^2)(R^2 - 2Rr\cos(\theta + \phi) + r^2)}$.    (26)

We write $Q$ as

    $Q = y\,Q_0(R, r, \theta, \phi)$.    (26')
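The explicit form of $Q$ follows from a short computation with the Poisson kernel $P(R, r, \alpha) = (R^2 - r^2)/(R^2 - 2Rr\cos\alpha + r^2)$. The abbreviations $D_\pm$ below are our own shorthand for the two denominators, not notation from the text.

```latex
\begin{align*}
D_\pm &= R^2 - 2Rr\cos(\theta \pm \phi) + r^2, \\
D_+ - D_- &= 2Rr\big[\cos(\theta - \phi) - \cos(\theta + \phi)\big]
           = 4Rr \sin\theta \sin\phi, \\
Q &= (R^2 - r^2)\Big(\frac{1}{D_-} - \frac{1}{D_+}\Big)
   = \frac{(R^2 - r^2)\, 4Rr \sin\theta \sin\phi}{D_- D_+}, \\
Q_0 &= \frac{Q}{y} = \frac{(R^2 - r^2)\, 4R \sin\phi}{D_- D_+}
   \qquad (y = r\sin\theta).
\end{align*}
```

For fixed $r$ and $\theta$, $D_\pm / R^2 \to 1$ uniformly in $\phi$ as $R \to \infty$. In a quotient of two such $Q_0$ with the same $R$ and $\phi$ the factor $4R\sin\phi$ cancels, which is why the quotient tends to 1 uniformly in $\phi$.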
The function $Q_0$ is positive for $0 < \phi < \pi$ and satisfies

    $\lim_{R\to\infty} \frac{Q_0(R, r_1, \theta_1, \phi)}{Q_0(R, r_2, \theta_2, \phi)} = 1$    (27)


uniformly in the parameter $\phi$. We take now any two points $(x_1, y_1)$ and $(x_2, y_2)$ in
the upper half-plane and use formula (25):


    $\frac{h(x_1, y_1)}{h(x_2, y_2)} = \frac{y_1 \int Q_0(R, r_1, \theta_1, \phi)\,k(\phi, R)\,d\phi}{y_2 \int Q_0(R, r_2, \theta_2, \phi)\,k(\phi, R)\,d\phi}$.    (28)


Now we let $R$ tend to $\infty$; the integrands in the numerator and denominator on the
right in (28) are positive, and by (27) their quotient tends to 1. We conclude that the
ratio of the integrals also tends to 1. Since the left side is independent of $R$, it follows
that


    $\frac{h(x_1, y_1)}{h(x_2, y_2)} = \frac{y_1}{y_2}$.


This proves that $h(x, y) = ay$, $a > 0$. By (23), $|p(w)| = e^{-ay}$, so $p(w) = e^{iaw}$ up to a
constant factor of absolute value 1; similarly $q(w) = e^{ibw}$, $b > 0$, as asserted in (22).


The analytic mapping $w \to (c - w)^{-1}$, $c$ any real number, carries the upper half-plane
into itself. It follows therefore from (21) that the inner functions


    $p(w) = \exp\{i(c - w)^{-1}\}$    (29)


are prime powers in the sense of theorem 9.
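The claim that $w \to (c - w)^{-1}$ preserves the upper half-plane is a one-line check; here $w = x + iy$ and $c$ is real.

```latex
\begin{align*}
\frac{1}{c - w} &= \frac{\overline{c - w}}{|c - w|^2}
   = \frac{(c - x) + i y}{(c - x)^2 + y^2}, \\
\operatorname{Im} \frac{1}{c - w} &= \frac{y}{(c - x)^2 + y^2} > 0
   \quad \text{for } y > 0 .
\end{align*}
```

For real $w \ne c$ the image is real, so the exponential in (29) has absolute value 1 on the boundary and less than 1 inside.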


Next we will show that the functions (21) and (29) are the only prime powers
by completing the discussion of factorization in $\mathcal{B}$. We return to the unit disk and,
for a function $b$ in $\mathcal{B}$ with zeros $u_j$, form the factors (19) with a
normalization that makes them real and positive at $z = 0$:


    $p_j(z) = \frac{|u_j|}{u_j}\,\frac{u_j - z}{1 - \bar u_j z}$.    (30)

We appeal now to the analogous discussion in chapter 9, section 9.2, that shows that
the infinite product, called a Blaschke product,


    $p(z) = \prod_j p_j(z)$    (31)

converges in the unit disk, is an inner function, and divides $b$:

    $b = pc$,  $c$ in $\mathcal{B}$.    (32)


The function $c$ is bounded, $|c| \le |b|$; it has no zeros, because all the zeros of $b$ were
thrown into $p$. Suppose $|b| < 1$; then $-\log c$ is an analytic function whose real part
is positive; for such functions we have derived in chapter 11, section 11.6, an integral
representation, equation (29'):

    $-\log c(z) = \int C(z, \theta)\,dm$,  $C(z, \theta) = \frac{e^{i\theta} + z}{e^{i\theta} - z}$,    (33)


$m$ being a nonnegative measure of finite total mass. Decompose $m$ into its absolutely
continuous and singular parts:


    $m = m_{\text{sing}} + m_{\text{ac}}$.


The corresponding decomposition of $c$ is


    $c(z) = \exp\left\{-\int C\,dm_{\text{sing}}\right\} \exp\left\{-\int C\,dm_{\text{ac}}\right\}$.    (33')


It is not too hard to show that the first factor on the right in (33') is an inner function,
the second an outer function. Setting (33') into (32) gives a factorization of $b$:


    $b = p \exp\left\{-\int C\,dm_{\text{sing}}\right\} \exp\left\{-\int C\,dm_{\text{ac}}\right\}$.    (34)


This is the inner-outer factorization of b, the inner factor being the product of the first 
two factors in (34). The first of the two is a product of primes, in general infinitely 
many, the second is a mixture of discrete and continuous products of prime powers, 
smeared out with respect to a singular measure. 


It follows from the representation (34) that, as claimed earlier, all primes are of
the form


    $\frac{u - z}{\bar u z - 1}$,  $|u| < 1$,


and that all prime powers are of form


    $\exp\left\{-a\,\frac{v + z}{v - z}\right\}$,  $|v| = 1$,  $a > 0$.


We conclude this section by remarking that parts of this theory, in particular, 
Beurling’s theorem, can be carried over to vector-valued analytic functions that are 
square integrable, respectively, bounded in the unit disk or upper half-plane; see 
Halmos. The inner factors are operator valued analytic functions, unitary a.e. on the 
boundary, and < 1 in norm in the upper half-plane. In the language of chapter 37, 
section 37.7, scattering matrices are operator valued inner functions. 


38.3 THE TITCHMARSH CONVOLUTION THEOREM 
We give now an application of theorem 9; see Lax. We consider $L^1$ functions on the
positive axis $\xi \ge 0$. We denote the lower end of the support of such a function $F$ by
$\ell_F$. That is,

    $\ell_F = \max\{\eta : F(\xi) = 0 \text{ for } \xi < \eta\}$.    (35)


An important theorem of Titchmarsh asserts:

Theorem 10. Let $A$ and $B$ be two $L^1$ functions on $\mathbb{R}_+$, $A * B$ their convolution:

    $(A * B)(\xi) = \int_0^{\xi} A(\eta)\,B(\xi - \eta)\,d\eta$.    (36)


Then


    $\ell_{A*B} = \ell_A + \ell_B$.    (37)
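A concrete instance of (37) may help. The two indicator functions below are our own choice of example; $|\cdot|$ denotes the length of an interval.

```latex
\begin{align*}
A &= \mathbf{1}_{[1,2]}, \qquad B = \mathbf{1}_{[3,5]},
   \qquad \ell_A = 1, \quad \ell_B = 3, \\
(A * B)(\xi) &= \int_0^{\xi} A(\eta)\, B(\xi - \eta)\, d\eta
   = \big|\, [1,2] \cap [\xi - 5,\ \xi - 3] \,\big|, \\
(A * B)(\xi) &> 0 \iff 4 < \xi < 7,
   \qquad \ell_{A*B} = 4 = \ell_A + \ell_B .
\end{align*}
```

For nonnegative functions such as these the equality is easy to see directly; the content of the theorem is that no cancellation can push the support of the convolution further to the right when $A$ and $B$ change sign or are complex valued.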


Proof. For $\xi < \ell_A + \ell_B$ the integrand on the right in (36) is zero, since at least one
of the two factors is zero. Therefore the integral is zero, that is, $(A * B)(\xi) = 0$ for
$\xi < \ell_A + \ell_B$; this proves that the left side of (37) is $\ge$ the right side; what remains
to be shown is that equality holds. To do this, we rely on a characterization due to
Paley and Wiener of the lower end of the support of an $L^1(\mathbb{R}_+)$ function $F$ in terms
of its Fourier transform


    $f(w) = \int_0^\infty F(\xi)\,e^{i\xi w}\,d\xi$.    (38)


Clearly, $f(w)$ is a bounded analytic function in the upper half-plane. Next we need


Theorem 11 (Paley-Wiener). The function $F$ in $L^1(\mathbb{R}_+)$ vanishes on the interval
$[0, \ell]$ iff its Fourier transform $f(w)$ is $\le \text{const.}\, e^{-\ell y}$ in absolute value in the upper
half-plane $\operatorname{Im} w = y > 0$. In the language of (35) we can express this as follows:


    $\ell_F = \max\{\ell : |f(w)\,e^{-i\ell w}| \le \text{const.}\}$.

Proof. Suppose that $F$ vanishes on the interval $[0, \ell]$; then we can write its Fourier
transform (38) by setting $\xi = \sigma + \ell$ as


    $f(w) = \int_\ell^\infty F(\xi)\,e^{i\xi w}\,d\xi = e^{i\ell w} \int_0^\infty F(\sigma + \ell)\,e^{i\sigma w}\,d\sigma$.
t 0 


Clearly, $f(w)e^{-i\ell w}$ is bounded in the upper half-plane. Conversely, suppose that
$f(w)e^{-i\ell w}$ is bounded in the upper half-plane; we claim that $F$, the Fourier inverse
of $f$, vanishes on $[0, \ell]$. We want to prove this by showing that for any smooth
function $G(\xi)$ supported on $[0, \ell - d]$, $d$ any positive number, the scalar product
$(F, G) = 0$. Denote the Fourier transform of $G$ by $g$:


    $g(x) = \int_0^{\ell - d} G(\xi)\,e^{i\xi x}\,d\xi$.    (39)

According to Parseval's formula

    $(F, G) = \frac{1}{2\pi}(f, g)$,

where

    $(f, g) = \int f(x)\,\bar g(x)\,dx$.    (40)

From (39) we have

    $\bar g(x) = \int_0^{\ell - d} \bar G(\xi)\,e^{-i\xi x}\,d\xi$.    (39')


This formula shows that $\bar g(x)$ can be extended as an entire function $h(w)$ to the whole
complex plane


    $h(w) = \int_0^{\ell - d} \bar G(\xi)\,e^{-i\xi w}\,d\xi$.


Since $G$ is smooth, $h(w)$ is bounded in the upper half-plane $\operatorname{Im} w = y > 0$ by
$\text{const.}\, e^{(\ell - d)y}/(1 + |w|^2)$. Since $f$ is analytic in the upper half-plane, we can by Cauchy's
theorem shift the line of integration in (40) from the real axis to the line $\operatorname{Im} w = y > 0$:


    $(f, g) = \int f(x)\,h(x)\,dx = \int f(x + iy)\,h(x + iy)\,dx$.    (41)


By construction,

    $|h(x + iy)| \le \text{const.}\, \frac{e^{(\ell - d)y}}{1 + x^2}$,    (42)

and by hypothesis,

    $|f(x + iy)| \le \text{const.}\, e^{-\ell y}$.    (43)


Using the estimates (42), (43) on the right in (41) shows that the right side of (41)
tends to zero as $y$ tends to $\infty$. Since the left side is independent of $y$, it is zero. It
follows from Parseval's formula (40) that $(F, G) = 0$ for all smooth functions $G$
supported on some subinterval of $(0, \ell]$. This proves that $F$ vanishes on $[0, \ell]$.
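The criterion can be tested on a simple function. The indicator function below is a hypothetical example of ours, with $w = x + iy$, $y > 0$, and $\ell' > \ell$ an arbitrary larger exponent.

```latex
\begin{align*}
F &= \mathbf{1}_{[\ell,\,\ell+1]}, \qquad
f(w) = \int_{\ell}^{\ell+1} e^{i\xi w}\, d\xi
     = e^{i\ell w} \int_0^1 e^{i\sigma w}\, d\sigma, \\
|f(w)\, e^{-i\ell w}| &\le \int_0^1 e^{-\sigma y}\, d\sigma \le 1, \\
|f(iy)\, e^{-i\ell' (iy)}| &= e^{(\ell' - \ell)\, y}\, \frac{1 - e^{-y}}{y}
   \longrightarrow \infty \quad (y \to \infty) \ \text{ if } \ell' > \ell .
\end{align*}
```

So $f(w)e^{-i\ell w}$ is bounded in the upper half-plane while no larger exponent works, in agreement with the fact that the lower end of the support of this $F$ is exactly $\ell$.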


We can restate the Paley-Wiener theorem in the language of division in the algebra
$\mathcal{B}$:


The lower end $\ell$ of the support of a function $F$ in $L^1(\mathbb{R}_+)$ is the highest power
of $e^{iw}$ that divides in $\mathcal{B}$ the Fourier transform of $F$.

We return to Titchmarsh's convolution theorem 10. Denote the Fourier transform
of $A$ and $B$ by $a(w)$ and $b(w)$; these are bounded analytic functions in the upper
half-plane. The Fourier transform of the convolution $A * B$ is the product $ab$. Using
the Paley-Wiener theorem, we can reformulate the statement (37) as follows:

Let $\ell_A$ and $\ell_B$ denote the highest powers of $e^{iw}$ that divide the functions $a(w)$ and
$b(w)$, respectively. The highest power of $e^{iw}$ that divides their product $ab$ is $\ell_A + \ell_B$.

To prove this, we factor $a$ and $b$ as $a = e^{i\ell_A w} c$, $b = e^{i\ell_B w} d$, $c$ and $d$ in $\mathcal{B}$. The
functions $c$ and $d$ are relatively prime to $e^{iw}$. Indeed, according to theorem 9, the
only divisors of $e^{iw}$ are of the form $e^{ikw}$, $k > 0$; on the other hand, $c$ and $d$ have
no divisors of that form, because then $a$ and $b$ would be divisible by a higher power
of $e^{iw}$ than stipulated. We appeal now to theorem 7 to conclude that the product $cd$
is relatively prime to $e^{iw}$. This proves that $ab = e^{i(\ell_A + \ell_B)w} cd$ is not divisible by a
power of $e^{iw}$ higher than $\ell_A + \ell_B$.


NOTE. It is a curious fact that Titchmarsh's convolution theorem is a result about
functions of real variables, yet Titchmarsh's original proof in 1924 used complex
variable theory. A real variables proof was furnished only in 1952 by Ryll-Nardzewski;
see Mikusinski. The proof described in this chapter shows that the approach
through complex variables is not unnatural.

HISTORICAL NOTE. During the Second World War the British Secret Service had
broken the “Enigma” code of the German armed forces. The advance information
gained through intercepts was of decisive help in many battles. It is less well known
that the Swedes also had broken the “Enigma” code; the mathematician who led
this effort was Arne Beurling.


The mathematician leading the British code breakers was the great logician Alan
Turing. After the war he was prosecuted for homosexuality and hounded into suicide.


BIBLIOGRAPHY


Beurling, A. On two problems concerning linear transformations in Hilbert space. Acta Math., 81 (1949):
239-255.


Halmos, P. Shifts on Hilbert spaces. Crelles J., 208 (1961): 102-112.


Lax, P. D. Translation invariant spaces. Acta Math., 101 (1959): 163-178.


Lax, P. D. Translation invariant spaces. In Proc. Int. Symp. on Linear Spaces. Israeli Acad. Press,
Jerusalem, 1961, pp. 299-306.


Paley, R. E. A. C. and Wiener, N. Fourier transforms in the complex domain. AMS Coll. Publ., 19. American
Mathematical Society, New York, 1934.


Mikusinski, J. Operational Calculus. Int. Series Monographs in Pure and Applied Math, 8. Pergamon
Press, New York, 1959.


Titchmarsh, E. C. The zeros of certain integral functions. Proc. London Math. Soc., 25 (1926): 283-302.


TEXTS 


Banach, S., Théorie des opérations linéaires, Monografje Matematyczne, Warsaw, 1932, Chelsea, 1955. 
Brezis, H., Analyse fonctionnelle, Théorie et applications, Masson, 1983.

Conway, J. B., A Course in Functional Analysis, Springer Verlag, 1985. 

Day, M. M., Normed Linear Spaces, Springer Verlag, 1962. 


Douglas, R. G., Banach Algebra Techniques in Operator Theory, Graduate Texts in Mathematics, Vol. 179,
Springer Verlag, 2nd ed. 1997.


Dunford, N. and Schwartz, J. T., Linear Operators, Part I: General Theory, (1958), Part II: Spectral 
Theory, (1963), Wiley-Interscience Series on Pure and Applied Mathematics, John Wiley and Sons. 


Edwards, R. E., Functional Analysis: Theory and Applications, Holt, Rinehart and Winston, 1965. 
Hille, E. and Phillips, R. S., Functional Analysis and Semi-groups, Colloquium Publ. AMS, 1957. 


Johnson, W. B. and Lindenstrauss, J., eds., Handbook on the Geometry of Banach Spaces, North Holland,
to appear.


Lindenstrauss, J. and Tzafriri, L., Classical Banach Spaces, I (1977), II (1979), Springer Verlag.


Morrison, T. J., Functional Analysis: An Introduction to Banach Space Theory, Wiley-Interscience Series 
on Pure and Applied Mathematics, John Wiley and Sons, 2001. 


Reed, M. and Simon, B., Methods of Modern Mathematical Physics, I: Functional Analysis, Academic 
Press, 1972. 


Riesz, F. and Sz.-Nagy, B., Leçons d'analyse fonctionnelle, Akadémiai Kiadó, 1952, Functional Analysis,
F. Ungar, 1955.


Rudin, W., Functional Analysis, McGraw-Hill, 1973. 
Schechter, M., Principles of Functional Analysis, Academic Press, 1971. 


Taylor, A.E. and Lay, D. C., Introduction to Functional Analysis, John Wiley and Sons, 1980. 


A


RIESZ-KAKUTANI REPRESENTATION THEOREM


There is a conundrum in mathematical analysis similar to the chicken or the egg
question: Which comes first, the Lebesgue integral or the Lebesgue measure? My
answer is, neither; first comes the space $L^1$. The traditional approaches enlarge the
class of continuous functions, and then show this class to be sufficiently large, that
is, complete in the $L^1$-norm. The approach described in the pages to follow puts the
horse before the cart. The object of our desire, $L^1$, is defined as an abstract space, the
completion in the $L^1$-norm of the space of continuous functions. Then each element
of $L^1$ is identified as a down-to-earth function, defined almost everywhere.

The theorem in the title states that every bounded, linear functional $\ell$ on the space
$C(Q)$ of continuous functions $c$ on a compact Hausdorff space $Q$ can be represented
as an integral with respect to a signed finite measure on the $\sigma$-algebra of the Borel
sets of $Q$:

    $\ell(c) = \int_Q c\,dm$.


In this note we use functional analysis to give a simple and natural proof of this basic 
proposition in the case where Q is a compact metric space. 


A.1 POSITIVE LINEAR FUNCTIONALS 


We will study bounded linear functionals $\ell$ on the linear space $C(Q)$ of continuous
real-valued functions $c$ on a compact metric space. Boundedness means that


    $\ell(c) \le \text{const.}\, |c|_{\max}$


for all continuous functions $c$. A linear functional is called positive if $\ell(c) \ge 0$ for all
nonnegative functions $c$. Note that a positive linear functional is monotone: $c_1 \le c_2$
implies that $\ell(c_1) \le \ell(c_2)$. It follows that a positive linear functional is bounded.
This is because every $c$ in $C(Q)$ satisfies $c \le |c|_{\max} u$, where $u$ is the unit function
on $Q$: $u(q) = 1$ for all $q$ in $Q$; therefore, by monotonicity, $\ell(c) \le |c|_{\max}\,\ell(u)$.




It is a standard result, and not hard to show, that every bounded linear functional
on $C(Q)$ can be decomposed as the difference of two positive linear functionals.
Therefore it suffices to prove the representation theorem for positive linear functionals;
in this case the representing measure is positive.

Given a positive linear functional $\ell$ on $C(Q)$, we define the $\ell$-norm of a continuous
function $c$ as follows:


    $|c|_\ell = \ell(|c|)$,    (1)


where $|c|$ is the absolute value of the function $c$: $|c|(q) = |c(q)|$ for every point $q$
of $Q$.
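Two simple positive functionals, chosen by us for illustration, show what definition (1) gives in practice; $q_0$ denotes an arbitrary fixed point of $Q$.

```latex
\begin{align*}
Q = [0,1], \quad \ell(c) = \int_0^1 c(x)\, dx \ \ (\text{Riemann integral}):
   \qquad & |c|_\ell = \int_0^1 |c(x)|\, dx; \\
\ell(c) = c(q_0) \ \ (\text{evaluation at } q_0 \in Q):
   \qquad & |c|_\ell = |c(q_0)| .
\end{align*}
```

In the first case $|c|_\ell$ vanishes only for $c = 0$ and is the familiar $L^1$-norm on continuous functions. In the second case $|c|_\ell = 0$ for every $c$ that vanishes at $q_0$, so the quantity does not separate functions and the space obtained after identifying such functions is one dimensional.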


The quantity $|c|_\ell$ is a semi-norm on the space $C(Q)$. If we identify the continuous
functions whose difference has $\ell$-norm 0, we get a genuine norm on the quotient
space. We denote by $L$ the completion in the $\ell$-norm of this quotient space. We
recall that the elements of the completion are Cauchy sequences in the $\ell$-norm of
continuous functions; two Cauchy sequences are lumped together if their difference
is a null sequence in the $\ell$-norm.

It follows from definition (1) of the $\ell$-norm and monotonicity that $|\ell(c)| \le |c|_\ell$. It
follows that the functional $\ell$ can be extended by continuity to the whole space $L$. If
$\{c_n\}$ is a Cauchy sequence tending to $f$ in the $\ell$-norm, we define $\lim \ell(c_n) = \ell(f)$;
$\ell$ is a bounded linear functional on $L$:


    $|\ell(f)| \le |f|_\ell$.    (2)


Next we show that the elements of $L$ have some functionlike attributes.

Theorem 1. Let $\phi$ be a Lipschitz continuous function $\mathbb{R} \to \mathbb{R}$:

    $|\phi(x) - \phi(y)| \le k\,|x - y|$.    (3)


Let $f$ be any element of $L$; then $\phi(f)$ can be defined as an element of $L$. Furthermore,
for any pair of elements $f$ and $g$ of $L$,


    $|\phi(f) - \phi(g)|_\ell \le k\,|f - g|_\ell$.    (4)


Proof. Let $\{c_n\}$ be a Cauchy sequence in the $\ell$-norm of continuous functions;
denote their limit in $L$ by $f$. It follows from the definition of the $\ell$-norm, the
monotonicity of $\ell$, and (3), that

$$ |\phi(c_n) - \phi(c_m)|_\ell \le k|c_n - c_m|_\ell. $$

Thus $\{\phi(c_n)\}$ too is a Cauchy sequence; denote its limit in $L$ by the symbol $\phi(f)$.
Concerning the second part: let $\{d_n\}$ be a sequence of continuous functions tending
in the $\ell$-norm to $g$. By (3),

$$ |\phi(c_n) - \phi(d_n)| \le k|c_n - d_n| $$

holds at every point $q$ of $Q$. Since $\ell$ is monotonic,

$$ \ell(|\phi(c_n) - \phi(d_n)|) \le k\,\ell(|c_n - d_n|); $$

by definition of the $\ell$-norm, this can be expressed as

$$ |\phi(c_n) - \phi(d_n)|_\ell \le k|c_n - d_n|_\ell. $$

As $n$ tends to $\infty$, the left and right sides tend to the left and right sides of (4). □
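
A concrete finite model may help fix ideas. The sketch below assumes (this is not in the
text) that $Q$ is a finite set of points and that $\ell$ is a weighted sum with nonnegative
weights; the names `ell`, `ell_norm`, and `apply` are hypothetical. It checks inequality (4)
for the positive-part function, which is Lipschitz with $k = 1$.

```python
# Finite model (an assumption for illustration): Q = {0, 1, 2, 3},
# ell(c) = sum of w[q] * c[q] with nonnegative weights w.
w = [0.5, 1.0, 0.0, 2.0]

def ell(c):
    """Positive linear functional on functions c: Q -> R, stored as lists."""
    return sum(wq * cq for wq, cq in zip(w, c))

def ell_norm(c):
    """The ell-norm of definition (1): ell applied to |c|."""
    return ell([abs(cq) for cq in c])

def apply(phi, c):
    """Compose a real function phi with c pointwise."""
    return [phi(cq) for cq in c]

def positive_part(x):
    """A Lipschitz function with constant k = 1."""
    return x if x > 0 else 0.0

f = [1.5, -2.0, 7.0, 0.25]
g = [-0.5, 3.0, -4.0, 1.0]
k = 1.0

phi_f = apply(positive_part, f)
phi_g = apply(positive_part, g)
lhs = ell_norm([a - b for a, b in zip(phi_f, phi_g)])   # |phi(f) - phi(g)|_ell
rhs = k * ell_norm([a - b for a, b in zip(f, g)])       # k |f - g|_ell
print(lhs, rhs)
```

Because the weight at the point 2 is zero, `ell_norm` is only a semi-norm in this model,
which is the reason the text passes to a quotient space before completing.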
The usual rules of functional calculus hold. Here are two important examples:

Example 1.

$$ \phi_+(x) = \begin{cases} 0 & \text{for } x < 0 \\ x & \text{for } 0 \le x. \end{cases} $$

We denote $\phi_+(f)$ as $f_+$, and call it the positive part of $f$.

Definition 1. $f$ in $L$ is called nonnegative, denoted as $f \ge 0$, if $f_+ = f$.

Note that $f$ in $L$ is nonnegative iff it is the limit in the $\ell$-norm of a sequence of
nonnegative continuous functions. It follows that if $f \ge 0$, $|f|_\ell = \ell(f)$. Note that if
$f \ge 0$, $g \ge 0$, then $f + g \ge 0$.

We say that $f \le g$ if $g - f \ge 0$; clearly this relation is transitive in $L$.


Example 2.

$$ \phi_a^b(x) = \begin{cases} a & \text{for } x < a \\ x & \text{for } a \le x \le b \\ b & \text{for } b < x. \end{cases} $$

For $f$ in $L$ we denote $\phi_a^b(f)$ as $f_a^b$.

Definition 2. $f$ in $L$ is said to be bounded if $f = f_a^b$ for some $a$ and $b$.

Note that $f$ in $L$ is bounded iff it is the limit in the $\ell$-norm of a sequence of
uniformly bounded continuous functions.

Note that if $f$ in $L$ is bounded, and $\phi$ is Lipschitz continuous on the interval $[a, b]$,
then $\phi(f)$ can be defined as an element of $L$.

If $\phi(x, y)$ is a Lipschitz continuous function of two variables, then $\phi(f, g)$ can
be defined for any pair of elements $f$, $g$ in $L$. We will use only a special case of this.
Suppose that $f$ and $g$ are elements of $L$ that are both bounded. Then we can define
their product $fg$ as an element of $L$ that is bounded, as follows:

Let $\{c_n\}$ and $\{d_n\}$ be Cauchy sequences of bounded continuous functions tending
in the $\ell$-norm to $f$ and $g$, respectively. It is easy to show that $\{c_n d_n\}$ is a Cauchy
sequence in the $\ell$-norm; its limit in $L$ is defined as the product $fg$. Note that if
$f \ge 0$, $g \ge 0$, then $fg \ge 0$.
The following result, although simple, is very useful.

Theorem 2 (Monotone Convergence Theorem). Let $\{f_n\}$ be a monotone sequence of
elements in $L$, say increasing, $f_n \le f_{n+1}$. Suppose that the sequence of numbers
$\ell(f_n)$ is bounded; then $f_n$ converges in the $\ell$-norm to a limit $f$ in $L$.

Proof. Since the functional $\ell$ is monotone, $\ell(f_n)$ is an increasing sequence of real
numbers. Being bounded, it has a limit. We claim that $\{f_n\}$ is a Cauchy sequence, for,
since $f_n - f_m$ is nonnegative for $n > m$,

$$ |f_n - f_m|_\ell = \ell(f_n - f_m) = \ell(f_n) - \ell(f_m). $$

Since $\ell(f_n)$ is a convergent sequence, the right side tends to zero as $n$, $m$ tend to $\infty$.
Having shown that $\{f_n\}$ is a Cauchy sequence, it follows from the completeness of $L$
that $f_n$ converges to a limit in $L$. □


Theorem 3. For any $f$ in $L$

$$ \ell\text{-}\lim_{a \to -\infty,\ b \to \infty} f_a^b = f. $$

Proof. Let $\{c_n\}$ be a Cauchy sequence of continuous functions whose $\ell$-limit is $f$.
It follows that for any $\epsilon > 0$ there is an $N$ such that

$$ |f - c_N|_\ell < \epsilon. \qquad (5) $$

Since $c_N$ is a continuous function on a compact space, it is bounded: $a \le c_N(q) \le b$
for all $q$ in $Q$, for any $a$ below the lower bound and any $b$ above the upper bound of $c_N$.
Define $\phi_a^b$ as in example 2; by inequality (4), with $k = 1$,

$$ |\phi_a^b(f) - \phi_a^b(c_N)|_\ell \le |f - c_N|_\ell < \epsilon. $$

Since the values of $c_N$ lie in the interval $[a, b]$, $\phi_a^b(c_N) = c_N$, so we can rewrite the
inequality above as

$$ |f_a^b - c_N|_\ell < \epsilon. \qquad (6) $$

Using the triangle inequality and the estimates (5) and (6), we get

$$ |f - f_a^b|_\ell = |f - c_N + c_N - f_a^b|_\ell \le |f - c_N|_\ell + |c_N - f_a^b|_\ell < 2\epsilon. \qquad \square $$
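
As an illustration of theorem 3, suppose (an assumption made only for this example) that
$Q = [0, 1]$, that $\ell$ is the Riemann integral, and that $f$ is the unbounded element realized
by $x^{-1/2}$. Since $f \ge 0$ we may take $a = 0$, and for $b \ge 1$ the truncation error can be
computed in closed form:

```latex
f_0^b(x) = \min\bigl(x^{-1/2},\, b\bigr), \qquad
|f - f_0^b|_\ell = \int_0^{1/b^2} \bigl(x^{-1/2} - b\bigr)\,dx
= \frac{2}{b} - \frac{1}{b} = \frac{1}{b} \;\longrightarrow\; 0
\quad \text{as } b \to \infty .
```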


A.2 VOLUME 


In this section we show how to use the positive functional $\ell$ to define the volume of
any open set $G$ in $Q$.

Definition 3. A continuous function $c$ is called admissible for the open set $G$ if

(i) the support of $c$ is contained in $G$.
(ii) $c(q) \le 1$ for all $q$ in $Q$.

The volume $V(G)$ with respect to $\ell$ of an open set $G$ is defined as

$$ V(G) = \sup_{c \text{ admissible}} \ell(c). \qquad (7) $$

Theorem 4.

(i) The volume of the empty set is zero.
(ii) $V$ is a monotonic set function: if $G \subset H$, then $V(G) \le V(H)$.
(iii) $V$ is countably subadditive: $V(\bigcup_1^\infty G_n) \le \sum_1^\infty V(G_n)$.
(iv) $V$ is countably additive, that is, if $\{G_n\}$ is a collection of pairwise disjoint open
sets,

$$ V\Big(\bigcup G_n\Big) = \sum V(G_n). $$


Proof. Parts (i) and (ii) are obvious. To prove (iii), take any continuous function
$c$ admissible for $\bigcup G_n$. The support of $c$ is a closed set; since $Q$ is compact, so is the
support of $c$. Being admissible, it is covered by $\bigcup G_n$; therefore by compactness it is
covered by a finite subcollection $\bigcup_1^N G_n$. We show now that $c$ can be decomposed as

$$ c = \sum_1^N c_n, \qquad (8) $$

where $c_n$ is admissible for $G_n$, $n = 1, \ldots, N$. (iii) would follow from this; for,
applying $\ell$ gives

$$ \ell(c) = \sum_1^N \ell(c_n); $$

since by definition of volume $\ell(c_n) \le V(G_n)$,

$$ \ell(c) \le \sum_1^N V(G_n). $$

Taking the supremum over all admissible $c$ yields (iii).

We turn now to the decomposition (8). Each $q$ in the support of $c$ belongs to at
least one of the sets $G_n$, $n = 1, \ldots, N$. There exists a continuous function $b_q$ with
the following properties:


(i) $b_q$ is nonnegative.
(ii) $b_q(q) > 0$.
(iii) The support of $b_q$ lies in one of the sets $G_n$.

The set where $b_q$ is positive is an open set $O_q$ which contains $q$; the union of these
open sets $O_q$ contains the support of $c$. Since the support of $c$ is compact, it is covered
by a finite number of these open sets. The sum of the corresponding functions $b_{q_j}$ is
positive on the support of $c$:

$$ \sum_j b_{q_j} > 0 \quad \text{on supp } c. $$

Assign to each $b_{q_j}$ an open set $G_n$ that contains its support. Denote by $b_n$ the sum
of those $b_{q_j}$ assigned to $G_n$. Clearly, $\sum_1^N b_n = \sum_j b_{q_j}$; now define

$$ c_n = c\,\frac{b_n}{\sum_1^N b_m}, \qquad n = 1, \ldots, N. $$

Clearly, $\sum c_n = c$, and each $c_n$ is admissible for $G_n$. This completes the construction
of the decomposition (8).

(iv) Given any $N$, we can choose admissible functions $c_n$ for $G_n$, $n \le N$, so that

$$ V(G_n) - \frac{1}{N^2} \le \ell(c_n). $$

Since the $G_n$ are pairwise disjoint, $\sum_1^N c_n$ is admissible for $\bigcup_1^N G_n$, and therefore

$$ \ell\Big(\sum_1^N c_n\Big) \le V\Big(\bigcup_1^N G_n\Big). $$

Using the linearity of $\ell$ and the inequalities above, we get

$$ \sum_1^N V(G_n) - \frac{1}{N} \le V\Big(\bigcup_1^N G_n\Big). $$

Letting $N \to \infty$, we get

$$ \sum_1^\infty V(G_n) \le V\Big(\bigcup G_n\Big). $$

By countable subadditivity the opposite inequality holds; so equality must prevail. □
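
The decomposition (8) can be tried out numerically. The sketch below is only an
illustration under assumed data: $Q = [0, 1]$, $G_1 = (0, 0.6)$, $G_2 = (0.4, 1)$, and tent-shaped
functions playing the role of the sums $b_1$, $b_2$; the helper names `tent` and `piece` are
hypothetical.

```python
# Assumed example data: Q = [0, 1], G1 = (0, 0.6), G2 = (0.4, 1).
def tent(x, lo, hi):
    """Nonnegative continuous function, positive exactly on (lo, hi)."""
    return max(0.0, min(x - lo, hi - x))

def c(x):
    """Continuous, values in [0, 1], support [0.1, 0.9] inside G1 U G2."""
    return min(1.0, 5.0 * tent(x, 0.1, 0.9))

def b1(x):
    """Support [0.05, 0.55], contained in G1."""
    return tent(x, 0.05, 0.55)

def b2(x):
    """Support [0.45, 0.95], contained in G2."""
    return tent(x, 0.45, 0.95)

def piece(bn, x):
    """c_n = c * b_n / (b_1 + b_2), set to 0 where c vanishes."""
    if c(x) <= 0.0:
        return 0.0
    return c(x) * bn(x) / (b1(x) + b2(x))

grid = [i / 1000 for i in range(1001)]
c1 = [piece(b1, x) for x in grid]
c2 = [piece(b2, x) for x in grid]
max_gap = max(abs(c(x) - p - q) for x, p, q in zip(grid, c1, c2))
print(max_gap)  # c1 + c2 reproduces c up to rounding
```

The division is safe because $b_1 + b_2$ is positive wherever $c$ is, which is exactly the
property the proof extracts from compactness.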


The following simple estimate is useful:

Theorem 5. Let $h$ be a continuous nonnegative function defined on $Q$, $a$ some positive
number, $G_a$ the open set $G_a = \{q : h(q) > a\}$. Then

$$ V(G_a) \le \frac{1}{a}\,\ell(h). \qquad (9) $$

Proof. We show that any continuous function $c_a$ admissible for $G_a$ satisfies for
every $q$,

$$ c_a(q) \le \frac{1}{a}\,h(q). $$

Clearly, this is true for $q$ not in $G_a$, for then $c_a(q) = 0$, and $h(q) \ge 0$. It is equally
true for $q$ in $G_a$, for then by definition $h(q)/a$ is greater than 1, while $c_a(q) \le 1$.
Since $\ell$ is monotonic,

$$ \ell(c_a) \le \frac{1}{a}\,\ell(h). $$

Taking the supremum of this inequality over all admissible functions $c_a$ yields
inequality (9). □
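
Inequality (9) is easy to test in a finite model. The sketch assumes, purely for
illustration, that $Q$ is a finite set in which every subset is open and that $\ell$ is a
weighted sum; in that setting the supremum in (7) is attained by the indicator function
of $G$. The names `volume` and `superlevel` are hypothetical.

```python
# Finite model (assumed): Q = {0,...,4}, every subset is open,
# ell(c) = sum of w[q] * c[q], so V(G) = sum of w[q] over q in G.
w = [0.2, 1.0, 0.5, 0.3, 2.0]
h = [0.0, 3.0, 0.4, 1.5, 0.9]   # a nonnegative function on Q

def ell(c):
    return sum(wq * cq for wq, cq in zip(w, c))

def volume(G):
    """Supremum in (7), attained here by the indicator function of G."""
    return sum(w[q] for q in G)

def superlevel(a):
    """G_a = {q : h(q) > a}."""
    return [q for q in range(len(h)) if h[q] > a]

checks = [(a, volume(superlevel(a)), ell(h) / a) for a in (0.5, 1.0, 2.0)]
for a, lhs, rhs in checks:
    print(a, lhs, rhs)   # lhs = V(G_a), rhs = ell(h) / a
```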


Definition 4. A subset $S$ of $Q$ is called negligible if it can be covered by an open set
of arbitrarily small volume.

It follows from the countable subadditivity of volume [see part (iii) of theorem 4]
that the countable union of negligible sets is negligible.

When a relation holds for all points $q$ of $Q$ with the exception of a negligible set, we
say it holds almost everywhere, abbreviated as a.e.


A.3 L AS A SPACE OF FUNCTIONS


In this section we show how to associate to any element $f$ in $L$ a function $f(q)$ on
$Q$, up to negligible sets of points.

Definition 5. A Cauchy sequence $\{c_n\}$ is called rapidly converging if

$$ |c_n - c_{n+1}|_\ell \le \frac{k}{n^4} \qquad (10) $$

for all $n$ and some constant $k$.

Note that every Cauchy sequence contains rapidly convergent subsequences.
Clearly, a sequence satisfying (10) is a Cauchy sequence.

Theorem 6. A rapidly converging Cauchy sequence $\{c_n\}$ of continuous functions
converges almost everywhere, that is, except for a negligible subset of $Q$.




Proof. Denote by $G_n$ the open set of points satisfying

$$ |c_n(q) - c_{n+1}(q)| > \frac{1}{n^2}. \qquad (11) $$

Combining inequality (9) for $h = |c_n - c_{n+1}|$ and $a = 1/n^2$ and (10), we have

$$ V(G_n) \le n^2\,\ell(|c_n - c_{n+1}|) \le \frac{k}{n^2}. \qquad (12) $$

It follows from (11) that for $q$ not in $G_n$, $|c_n(q) - c_{n+1}(q)| \le 1/n^2$; therefore the
sequence $c_n(q)$, written as a sum

$$ c_n = \sum_{j=1}^n (c_j - c_{j-1}), \qquad c_0 = 0, $$

converges at every point $q$ that belongs to only a finite number of the sets $G_n$. The
exceptional set of points, where $\{c_n(q)\}$ fails to converge, can thus be enclosed in the
union $\bigcup_N^\infty G_n$. By countable subadditivity, the volume of this set is bounded by

$$ V\Big(\bigcup_N^\infty G_n\Big) \le \sum_N^\infty V(G_n), $$

which according to (12) is $\le \sum_N^\infty k/n^2$ and thus tends to zero as $N$ tends to $\infty$.
Thus the exceptional set can be covered by open sets of arbitrarily small volume, and
therefore forms a negligible set. □
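
The last step uses only that the tail of $\sum k/n^2$ is small. One way to make this
quantitative (a standard telescoping estimate, not taken from the text) is, for $N \ge 2$:

```latex
\sum_{n=N}^{\infty} \frac{k}{n^2}
\;\le\; \sum_{n=N}^{\infty} \frac{k}{n(n-1)}
\;=\; k \sum_{n=N}^{\infty} \Bigl(\frac{1}{n-1} - \frac{1}{n}\Bigr)
\;=\; \frac{k}{N-1}.
```

So the exceptional set lies in an open set of volume at most $k/(N-1)$.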


It is important to note that the sequence $c_n(q)$ converges uniformly outside the
open set $\bigcup_N^\infty G_n$, $N$ arbitrary.

Given any element $f$ in $L$, take any Cauchy sequence $\{c_n\}$ of continuous functions
that converges to $f$. It is easy to see we can select a subsequence that converges
rapidly; the pointwise limit of this sequence is called a realization of $f$.

We claim that realizations obtained from any two Cauchy sequences converging
rapidly to $f$ are equal a.e., for, we can select a single rapidly convergent Cauchy
sequence that contains infinitely many terms from both sequences.

The following theorem characterizes the relation of $f$ in $L$ to its realization.

Theorem 7.

(i) For any Lipschitz continuous $\phi$, $\phi(f)(q) = \phi(f(q))$.
(ii) $(f + g)(q) = f(q) + g(q)$.
(iii) If $f$ and $g$ are bounded elements of $L$, $(fg)(q) = f(q)g(q)$.
(iv) The realization of elements of $L$ by functions is faithful, that is, if $f(q) =
g(q)$ a.e., then $f = g$ in $L$.
(v) Suppose that $\ell$-limit $f_n = f$ and that $f_n(q)$ converges a.e. Then the pointwise
limit of $f_n(q)$ is a realization of the $\ell$-limit $f$.




Proof. Parts (i), (ii), and (iii) are obvious. To prove (iv), it suffices to show that
if $f$ in $L$ is represented by a function that is zero a.e., then $f = 0$ in $L$. We argue
indirectly, supposing that $f \ne 0$. So we may assume that $f$ is nonnegative. Since
according to theorem 3, $f_0^b$ tends to $f$ in the $\ell$-norm, $f_0^b \ne 0$ for $b$ large enough. It
follows that it suffices to consider nonnegative, bounded elements of $L$; we may take
the upper bound to be 1.

Let $\{c_n\}$ be a rapidly converging Cauchy sequence of continuous functions, each of
which lies between 0 and 1, and tends in the $\ell$-norm to $f$, and to zero a.e. As noted
at the conclusion of theorem 6, for any $\epsilon > 0$ there is an open set $G_\epsilon$ of volume less
than $\epsilon$, such that outside $G_\epsilon$ the sequence $c_n(q)$ converges uniformly to $f(q)$. Since
$f(q)$ is assumed to be zero a.e., for all $N$ large enough

$$ c_N(q) < \epsilon \quad \text{on the complement of } G_\epsilon, \qquad (13) $$

$$ |f - c_N|_\ell < \epsilon. \qquad (13') $$

We decompose $c_N$ as follows:

$$ c_N = c_N - 2\epsilon u + 2\epsilon u \le (c_N - 2\epsilon u)_+ + 2\epsilon u, \qquad (14) $$

where $u$ is the unit function on $Q$. Since $\ell$ is monotonic, we deduce from (14) that

$$ \ell(c_N) \le \ell((c_N - 2\epsilon u)_+) + 2\epsilon\,\ell(u). \qquad (15) $$

It follows from (13) and the fact that $c_N(q) \le 1$ for all $q$, that the function
$(c_N - 2\epsilon u)_+$ is admissible for $G_\epsilon$. Therefore

$$ \ell((c_N - 2\epsilon u)_+) \le V(G_\epsilon). $$

Setting this into (15), and using the fact that $V(G_\epsilon) < \epsilon$, we obtain

$$ \ell(c_N) \le (1 + 2\ell(u))\epsilon. $$

Using this inequality and (13') combined with (2), we obtain

$$ \ell(f) = \ell(f - c_N) + \ell(c_N) \le |f - c_N|_\ell + (1 + 2\ell(u))\epsilon \le 2(1 + \ell(u))\epsilon. $$

Since $f$ is a nonnegative element of $L$, $\ell(f) = |f|_\ell$; since $\epsilon$ is arbitrary, we conclude
that $|f|_\ell = 0$. This contradicts the presumption that $f \ne 0$ in $L$, and completes the
demonstration of (iv).

To prove (v), select from $\{f_n\}$ a rapidly converging subsequence, also denoted as
$\{f_n\}$, in the sense of inequality (10). For each $n$ choose a continuous function $c_n$ so
that

(a) $|f_n - c_n|_\ell < 1/n^4$,
(b) $|f_n(q) - c_n(q)| < 1/n^2$, except on an open set of volume $< 1/n^2$.

It follows from (a) and (10) for $\{f_n\}$ that $\{c_n\}$ converges rapidly in the $\ell$-norm, and
that its $\ell$-limit is $f$. Therefore $c_n(q)$ tends to $f(q)$ a.e. On the other hand, it follows
from (b) that $\lim c_n(q) = \lim f_n(q)$ a.e. □




Theorem 8. Let $f$ be an element of $L$ whose realization $f(q)$ is $\ge 0$ a.e. Then $f$ is
$\ge 0$ as an element of $L$.

Proof. According to part (i) of theorem 7, $f_+(q) = \phi_+(f(q))$. If $f(q) \ge 0$ a.e.,
it follows that $f_+(q) = f(q)$ a.e. It follows then from part (iv) of theorem 7 that
$f_+ = f$; but this is the definition of $f \ge 0$. □

Corollary. Two elements $f$ and $g$ in $L$ satisfy $f \le g$ iff $f(q) \le g(q)$ a.e.

Proof. The relation $f \le g$ means that $g = f + p$, $p$ nonnegative. Now apply
theorem 8 to $p$. □


A.4 MEASURABLE SETS AND MEASURE 


Definition 6. A set $S$ in $Q$ is measurable if its characteristic function $c_S$:

$$ c_S(q) = \begin{cases} 1 & \text{when } q \text{ in } S \\ 0 & \text{when } q \text{ not in } S \end{cases} \qquad (16) $$

is the realization of some element $f_S$ in $L$. The measure of $S$ is defined to be

$$ m(S) = \ell(f_S). \qquad (17) $$

Note that the notion of measurability, and the value of measure, depends on the
linear functional $\ell$.
We show first that this notion is not vacuous.


Theorem 9. Every open set is measurable, and its measure is its volume.

Proof. We will use the function $d(q, G^c)$, the distance of $q$ from the complement
of $G$. Set

$$ \phi_n(x) = \begin{cases} 0 & \text{for } x \le 1/(2n) \\ \text{linear in between} \\ 1 & \text{for } 1/n \le x. \end{cases} $$

Define the continuous functions $c_n(q) = \phi_n(d(q, G^c))$. Clearly, $c_n$ is an increasing
sequence of functions, and equally clearly, $\ell(c_n)$ is bounded by $\ell(u)$. Therefore, by
theorem 2, the monotone convergence theorem, the sequence $\{c_n\}$ converges in the
$\ell$-norm to a limit which we denote as $f_G$. On the other hand, $\{c_n(q)\}$ converges for
every $q$ to the characteristic function $c_G(q)$ of $G$. According to part (v) of theorem 7,
$c_G(q)$ is the realization of $f_G$; this proves that $G$ is measurable.

To determine the measure of $G$, we note that since $c_n$ is $\ell$-convergent to $f_G$, $\ell(c_n)$
tends to $\ell(f_G)$. Since each $c_n$ is admissible for $G$, $\ell(c_n) \le V(G)$, and so $\ell(f_G) =
\lim \ell(c_n) \le V(G)$. On the other hand, given any admissible $c$, we can choose $n$ so
large that $c \le c_n$. Since $V(G) = \sup \ell(c)$, it follows that $\ell(f_G) = \lim \ell(c_n) \ge
V(G)$. Combining the two inequalities shows that $\ell(f_G) = V(G)$. □

Theorem 10. The collection of sets measurable in the sense of (16) is a $\sigma$-algebra,
and the set function $m(S)$ defined by (17) is a measure. That is,

(i) the complement of a measurable set is measurable.
(ii) the intersection of two measurable sets is measurable.
(iii) the denumerable union of measurable sets is measurable.
(iv) $m(S)$ is a countably additive set function.

Proof. (i) $S$ is measurable if its characteristic function $c_S$ is the realization of an
element $f_S$ in $L$. But then the realization of $u - f_S$ is $1 - c_S$, the characteristic
function of the complement of $S$.

(ii) If $f_{S_1}$ and $f_{S_2}$ are elements in $L$ whose realizations are the characteristic
functions of $S_1$ and $S_2$, then the realization of the product $f_{S_1} f_{S_2}$ is the characteristic
function of $S_1 \cap S_2$.

(iii) Let $\{S_n\}$ be a denumerable collection of measurable sets. We can replace
this collection by another one, $\{T_n\}$, where $T_n = S_n \cap (S_1 \cup \ldots \cup S_{n-1})^c$. Clearly,
$T_1 \cup \ldots \cup T_n = S_1 \cup \ldots \cup S_n$, and therefore $\bigcup T_n = \bigcup S_n$. The sets $T_n$ are pairwise
disjoint. Consider now the series $\sum_1^\infty f_{T_n}$; since each $f_{T_n}$ is nonnegative, the partial
sums $\sum_1^N f_{T_n}$ of this series are increasing. Furthermore, since the $T_n$ are pairwise
disjoint, $\sum_1^N f_{T_n} \le u$; therefore $\ell(\sum_1^N f_{T_n}) \le \ell(u)$. We appeal now to the monotone
convergence theorem to conclude that the partial sums converge in the $\ell$-norm to a
limit that we denote by $f_T$:

$$ \sum_1^\infty f_{T_n} = f_T. \qquad (18) $$

On the other hand, the realization of the partial sums is $\sum_1^N c_{T_n}(q)$, which converge
pointwise to $c_T(q)$, where $T = \bigcup T_n$. By part (v) of theorem 7, $c_T$ is the realization
of $f_T$; therefore $T = \bigcup T_n = \bigcup S_n$ is a measurable set.

(iv) Since the partial sums of (18) converge in the $\ell$-norm to $f_T$, we can apply
$\ell$ to obtain $\sum_1^\infty \ell(f_{T_n}) = \ell(f_T)$. Using the definition (17) on both sides, we obtain
$\sum_1^\infty m(T_n) = m(T)$. □

Theorem 11. A realization $f(q)$ of any $f$ in $L$ is a measurable function, that is, the
sets $K_a$:

$$ K_a = \{q : f(q) \ge a\}, \qquad (19) $$

$a$ any real number, are measurable sets.

Proof. We may assume that $a$ is positive, for we can always add a constant to $f$.
We may further assume that $a = 1$, for we can always divide $f$ by $a$. Recall that $f_0^1$




denotes the truncation of $f$ from below by 0, and from above by 1. The sequence
$f_n = (f_0^1)^n$ is decreasing, since $f_n - f_{n+1} = (f_0^1)^n (u - f_0^1)$ is the product of two
nonnegative elements of $L$. Since each $f_n$ is nonnegative, all the $\ell(f_n)$ are bounded
from below by zero. So by the monotone convergence theorem, the sequence $f_n$
converges in the $\ell$-norm to a limit. On the other hand, the sequence of realizations

$$ f_n(q) = (f_0^1(q))^n $$

converges to 1 at all points $q$ where $f(q) \ge 1$, and to 0 at all points where $f(q) < 1$.
By part (v) of theorem 7, this shows that the characteristic function of the set $K_1$
defined by (19) is a realization of the $\ell$-limit of $\{f_n\}$. □


Replacing $f$ by $-f$, and taking complements, we can show that all sets of the
form

$$ \{q : f(q) \le a\}, \quad \{q : f(q) < a\}, \quad \{q : f(q) > a\} $$

are measurable as well.
We are ready to show that the measure $m$ we have constructed in (17) yields the
linear functional $\ell$,

$$ \ell(c) = \int c\,dm \qquad (20) $$

for all continuous functions $c$. To see this, we form the sets $K_{j,\epsilon}$:

$$ K_{j,\epsilon} = \{q : j\epsilon \le c(q) < (j+1)\epsilon\}. \qquad (21) $$

Denote the characteristic function of $K_{j,\epsilon}$ by $k_{j,\epsilon}$. It follows from (21) that

$$ \sum_j j\epsilon\,k_{j,\epsilon}(q) \le c(q) \le \sum_j (j+1)\epsilon\,k_{j,\epsilon}(q) \qquad (22) $$

holds at each point $q$. Since $c$ is bounded, these sums are finite. According to the
corollary to theorem 8, we deduce from (22) that

$$ \sum_j j\epsilon\,k_{j,\epsilon} \le c \le \sum_j (j+1)\epsilon\,k_{j,\epsilon} \qquad (23) $$

holds in $L$. Since $\ell$ is a positive functional, we deduce that

$$ \sum_j j\epsilon\,\ell(k_{j,\epsilon}) \le \ell(c) \le \sum_j (j+1)\epsilon\,\ell(k_{j,\epsilon}). $$

Using the definition (17) of measure for the sets $K_{j,\epsilon}$, we can rewrite this as

$$ \sum_j j\epsilon\,m(K_{j,\epsilon}) \le \ell(c) \le \sum_j (j+1)\epsilon\,m(K_{j,\epsilon}). \qquad (24) $$

The left and right sides in (24) are lower and upper sums for the integral $\int c\,dm$ in
(20). Their difference,

$$ \epsilon \sum_j m(K_{j,\epsilon}) = \epsilon\,m\Big(\bigcup_j K_{j,\epsilon}\Big) = \epsilon\,m(Q), $$

tends to zero as $\epsilon$ does. Thus (24) shows that $\ell(c)$ is the only number $\ge$ all lower
sums and $\le$ all upper sums. This proves the representation formula (20). □
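
The squeeze in (24) can be watched numerically. The sketch below again assumes a finite
model, chosen only for illustration: $Q$ has five points, $m(\{q\})$ is a weight `w[q]`, and
$\ell$ is the corresponding weighted sum. The function name `lower_upper` is hypothetical.

```python
import math

# Finite model (assumed for illustration): Q = {0,...,4}, m({q}) = w[q],
# and ell(c) = sum of w[q] * c[q], which is the integral of c with respect to m.
w = [0.2, 1.0, 0.5, 0.3, 2.0]
c = [0.373, -1.217, 2.051, 0.004, 0.813]

def ell(func):
    return sum(wq * v for wq, v in zip(w, func))

def lower_upper(eps):
    """Lower and upper sums of (24); q lies in K_{j,eps} for j = floor(c(q)/eps)."""
    lower = upper = 0.0
    for wq, v in zip(w, c):
        j = math.floor(v / eps)
        lower += j * eps * wq
        upper += (j + 1) * eps * wq
    return lower, upper

total_mass = sum(w)  # m(Q)
results = [(eps,) + lower_upper(eps) for eps in (0.5, 0.1, 0.01)]
for eps, lower, upper in results:
    # The gap upper - lower equals eps * m(Q) and shrinks with eps.
    print(eps, lower, ell(c), upper, upper - lower)
```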


We remark that the representation formula (20) holds as well for all f in L. In this 
case the sums in (23) may be infinite. Taking f to be nonnegative, the convergence 
of the infinite series in (23) follows from the monotone convergence theorem. Since 
any f in L can be written as the difference of two nonnegative elements, the general 
case follows. 

We close this section with the converse of theorem 11. 


Theorem 12. Every function $g(q)$ defined on $Q$ that is measurable and integrable
is the realization of an element $g$ of $L$.


Proof. Measurability means that the sets $H_a = \{q : g(q) < a\}$ are measurable.
Define the function $\eta_g$, $\mathbb{R} \to \mathbb{R}$, by

$$ \eta_g(a) = m(H_a). $$

The function $g$ is called integrable if

$$ \int |a|\,d\eta_g < \infty. $$

Define $g_\epsilon$ in $L$ by

$$ g_\epsilon = \sum_j j\epsilon\,k_{j,\epsilon}, $$

where $k_{j,\epsilon}$ is defined after equation (21). As $\epsilon$ tends to zero, the realizations $g_\epsilon(q)$
tend to $g(q)$. It is easy to show that if we set $\epsilon = 2^{-n}$, we get a Cauchy sequence in
the $\ell$-norm, whose limit $g$ is realized by $g(q)$.


The uniqueness of the representing measure is a standard fact of measure theory. 


A.5 THE LEBESGUE MEASURE AND INTEGRAL 


If we take $Q$ to be a Euclidean multitorus, and $\ell$ to be the Riemann integral, our
construction gives the Lebesgue measure and integration. I consider this approach more
natural than the traditional ones; for the most important object in the Lebesgue
theory is the complete space $L^1$ (as well as the spaces $L^p$). In the traditional approaches
the completeness of $L^1$ is the last item to emerge; in the present approach it is the
first.


THEORY OF DISTRIBUTIONS 


In his formulation of quantum mechanics Dirac treated the continuous spectrum by
employing a function on $\mathbb{R}$, denoted as $\delta$ and ever since called Dirac's delta
function, that is zero everywhere except at $x = 0$, where it is so large that the integral
of $\delta$ over $\mathbb{R}$ equals 1. Of course there is no such function. Von Neumann, in his book
on quantum mechanics (1932), warns against basing a theory on such a fiction; he
knew very well how to treat the continuous spectrum rigorously.

The $\delta$-function can be given new life as a generalized function. The need for such
generalized functions was keenly felt in the 1920s and 1930s. Bochner introduced
such notions, rigorously, in the context of the Fourier transform, and Sobolev in the
context of partial differential equations. Hadamard's use of the "finite part" of an
integral in his formula for solutions of hyperbolic equations foreshadowed the need
for generalized functions, as did Wiener's justification of the Heaviside calculus.
L. C. Young's idea of "generalized curves" is a step in a similar direction. But it was
Laurent Schwartz, in the 1940s, who came up with the notion of distributions that
was general enough and supple enough to serve most purposes of both the theory of
partial differential equations and harmonic analysis. Harald Bohr, a leading
mathematician of his time (and brother of Niels Bohr), was among the first to recognize the
value of Schwartz's ideas. The world soon followed; at the International Congress of
Mathematicians in 1950 Laurent Schwartz was honored by a Fields Medal; pockets
of resistance eventually faded.

This appendix presents the bare bones of the theory of distributions, with a few
scraps of meat thrown in. For a fuller presentation I recommend Robert Strichartz's
book on the subject.


B.1 DEFINITIONS AND EXAMPLES 


We denote by $C_0^\infty$ the space of all complex-valued infinitely differentiable functions
in $\mathbb{R}^n$ of compact support. We say that a sequence $\{u_k\}$ of $C_0^\infty$ functions converges
to $u$ if the support of all $u_k$ is contained in the same compact set $K$, and if for each
multi-index $\alpha = (\alpha_1, \ldots, \alpha_n)$, $D^\alpha u_k = D_1^{\alpha_1} \cdots D_n^{\alpha_n} u_k$ converges uniformly to $D^\alpha u$,
where $D_j$ denotes partial differentiation with respect to $x_j$.




Definition. A distribution is an element in the dual of $C_0^\infty$, that is, a complex-valued
linear functional $\ell$ on $C_0^\infty$ that is continuous under sequential convergence as defined
above:

$$ \ell(u_k) \to \ell(u) \quad \text{if } u_k \to u. $$

This kind of continuity is equivalent to the following apparently more stringent
condition:

Theorem 1. Given a distribution $\ell$, to each compact set $K$ there is a positive integer
$N(K)$ and a positive number $c(K)$ such that for all $C_0^\infty$ functions $u$ whose support
is contained in $K$

$$ |\ell(u)| \le c\,|u|_N, \quad \text{where } |u|_N = \max_{x,\ |\alpha| \le N} |D^\alpha u(x)|, \quad |\alpha| = \sum \alpha_j. \qquad (1) $$

Proof. Given $K$, suppose that (1) is false for all $c$ and all $N$; that means that for
each $N$ there is a $C_0^\infty$ function $u_N$ with support in $K$ such that $\ell(u_N) = 1$, $|u_N|_N <
1/N$. Clearly, $u_N \to 0$ in the sense defined above; therefore $\ell(u_N)$ ought to tend to
$\ell(0) = 0$, which contradicts $\ell(u_N) = 1$.
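
Two simple instances, stated here only as illustrations, show what $N$ and $c$ can look
like in (1). For evaluation at the origin, the delta function of the introduction, one
may take $N = 0$ and $c = 1$; for evaluation of a first derivative at the origin, $N = 1$
and $c = 1$ suffice:

```latex
\ell(u) = u(0): \quad |\ell(u)| \le \max_x |u(x)| = |u|_0 ,
\qquad
\ell(u) = (D_1 u)(0): \quad |\ell(u)| \le \max_{x,\,|\alpha| \le 1} |D^\alpha u(x)| = |u|_1 .
```

In the second case no bound by $|u|_0$ alone is possible, since a function that is small
in the maximum norm can still have a large derivative at the origin.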


We call $|u|_N$ the $C^N$ norm. We embed $C_0^\infty$ in the space of distributions by assigning
to each $v$ in $C_0^\infty$ the linear functional

$$ \ell(u) = \int u v\,dx = (u, v). \qquad (2) $$

Clearly, (2) defines a continuous linear functional, and clearly different $v$'s define
different functionals. Thus $C_0^\infty$ functions are special kinds of distributions.

By the same token, distributions may be regarded as generalized functions. In
what follows we will often write

$$ \ell(u) = (u, \ell). $$

In (2) we have examples of distributions; we now give others. In each case we leave
it to the reader to verify that the given functional depends continuously on $u$ in the
sense of sequential convergence.

Example 1. $\ell(u) = \int u v\,dx$, $v$ any continuous function.

Example 2. $\delta(u) = u(0)$, the Dirac delta "function."

Example 3. $\ell(u) = \int (D^\alpha u) v\,dx$, $v$ any integrable function, $\alpha$ any multi-index.


Example 4. $p$ a $C^\infty$ function in $\mathbb{R}^1$ whose zeros are simple: if $p(y) = 0$, $p'(y) \ne 0$.
Define $\ell(u)$ as the principal value integral

$$ \ell(u) = \mathrm{PV} \int \frac{u(x)}{p(x)}\,dx = \lim_{\epsilon \to 0} \int_{|p(x)| > \epsilon} \frac{u(x)}{p(x)}\,dx. $$

Later we will show that the most general distribution can be built up from
distributions of the form given in example 3.

Definition. Let $D$ be an open set in $\mathbb{R}^n$; denote by $C_0^\infty(D)$ the space of all $C^\infty$ functions
with compact support in $D$. A distribution in $D$ is a linear functional on $C_0^\infty(D)$ continuous
under sequential convergence as defined above.
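
To make the examples tangible, a distribution can be modeled as a function that accepts
a test function and returns a number. The sketch below is a numerical stand-in under
stated assumptions: integrals are replaced by a midpoint rule, test functions are
standard bump functions, and all names (`bump`, `integrate`, `from_function`, `delta`)
are hypothetical.

```python
import math

def bump(x, center=0.0, radius=1.0):
    """A C-infinity function supported in [center - radius, center + radius]."""
    t = (x - center) / radius
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0

def integrate(func, a, b, cells=20000):
    """Midpoint rule; a numerical stand-in for the integral in (2)."""
    h = (b - a) / cells
    return h * sum(func(a + (i + 0.5) * h) for i in range(cells))

def from_function(v, a=-3.0, b=3.0):
    """Example 1: the distribution u -> integral of u(x) v(x) dx.
    Assumes the test functions used here are supported in [a, b]."""
    return lambda u: integrate(lambda x: u(x) * v(x), a, b)

def delta(u):
    """Example 2: the Dirac delta, u -> u(0)."""
    return u(0.0)

u = lambda x: bump(x, center=0.2)
w = lambda x: bump(x, center=-0.5, radius=0.5)
combo = lambda x: 2.0 * u(x) + 3.0 * w(x)   # the test function 2u + 3w

ell_abs = from_function(abs)                # v(x) = |x|, a continuous function
print(delta(u), ell_abs(u))
```

Both functionals are linear in the test function, and `delta` satisfies the bound (1)
with $N = 0$ and $c = 1$, since the value at the origin never exceeds the maximum.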


B.2 OPERATIONS ON DISTRIBUTIONS 


In this section we show how certain operations that can be carried out for ordinary
functions can also be carried out for distributions. These operations $T$ are of the
following kind:

(i) $T$ is a linear operator that maps $C_0^\infty$ continuously into $C_0^\infty$.

(ii) $T$ has a transpose $T'$ that also carries $C_0^\infty$ continuously into $C_0^\infty$. $T'$ being the
transpose of $T$ means that for all $u$, $v$ in $C_0^\infty$

$$ (T'u, v) = (u, Tv), \qquad (3) $$

where $(\,,)$ is the symmetric bilinear functional used in (2) to embed $C_0^\infty$ in
its dual. Note that the transpose $T'$ is uniquely determined by (3) and that the
transpose of $T'$ is $T$ itself.


The following rules dealing with transposes are obvious and useful:

(i) If $T$ and $S$ have transposes, so do $aT + bS$, and $(aT + bS)' = aT' + bS'$.
(ii) The transpose of $TS$ is $(TS)' = S'T'$.

Theorem 2 (Extension Theorem). Let $T$, $T'$ be continuous linear operators mapping
$C_0^\infty$ into $C_0^\infty$, transposes of each other. Let $\ell$ be any distribution; we define $T\ell$
as the distribution given by

$$ (T'v, \ell) = (v, T\ell). \qquad (4) $$

Proof. The reader can easily verify that the left side of (4) depends linearly and
continuously on $v$, since continuity of $T'$ is taken to mean that $T'$ maps every
convergent sequence of $C_0^\infty$ functions into a convergent sequence of $C_0^\infty$ functions. □




Exercise 1. Show that if $T$ and $S$ as mappings of $C_0^\infty$ into $C_0^\infty$ commute, then they
also commute as mappings of distributions into distributions.

Note that if $\ell$ happens to be in $C_0^\infty$, the definition of $T\ell$ by (4) is the same as the
original. We now give some interesting examples of linear operators with transposes.

Example 5. Let $t$ be a $C^\infty$ function; the operation $T$ of multiplying by $t$ clearly
maps $C_0^\infty$ linearly and continuously into $C_0^\infty$; equally clearly, $T$ is its own transpose.

Example 6. $T = D_j$, meaning differentiation with respect to $x_j$. Clearly, $T$ maps
$C_0^\infty$ into itself linearly and continuously, and $T' = -T$.

Example 7. $(T_a u)(x) = u(x - a)$, meaning that $T_a$ is translation by $a$. Clearly, $T_a$
maps $C_0^\infty$ into itself linearly and continuously, and $T_a'$ is translation by $-a$.

Example 8. $(Ru)(x) = u(-x)$. Clearly, $R' = R$.


Example 9. Let $t$ be a continuous function with compact support; define $Tu$ as the
convolution of $u$ with $t$, that is,

$$ Tu = (t * u)(x) = \int t(y)\,u(x - y)\,dy. \qquad (5) $$

Clearly, $T$ maps $C_0^\infty$ into itself linearly and continuously; $T'$ is convolution with $Rt$.
It follows from the extension theorem and example 9 that the convolution of any
distribution $\ell$ and any $C_0^\infty$ function $u$ is well defined as a distribution. We claim now
that $\ell * u$ is a $C^\infty$ function. To see this, we note that the classical formula (5) makes
sense for distributions $t$, provided that the integral on the right is interpreted as in (2),

$$ (\ell * u)(x) = (u_x, \ell), \qquad (6) $$

where $u_x$ denotes the function $y \mapsto u(x - y)$. Clearly, $u_x$, as an element of $C_0^\infty$, depends
continuously and differentiably on $x$, from which it follows that $(u_x, \ell)$ is a $C^\infty$
function of $x$.
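
Formula (6) translates directly into code if a distribution is modeled as a function
acting on test functions. The sketch below is illustrative only: it uses the convention
$u_x(y) = u(x - y)$, which matches formula (5), approximates the derivative of $\delta$ by a
central difference, and all names (`convolve`, `d_delta`, and so on) are hypothetical.

```python
import math

def u(x):
    """A C-infinity test function (bump supported in [-1, 1])."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def du(x):
    """Exact derivative of u, used only to check the result."""
    return u(x) * (-2.0 * x) / (1.0 - x * x) ** 2 if abs(x) < 1.0 else 0.0

def delta(test):
    """The Dirac delta: test -> test(0)."""
    return test(0.0)

def d_delta(test, step=1e-5):
    """Derivative of delta, (test, D delta) = -(D test)(0), by a central difference."""
    return -(test(step) - test(-step)) / (2.0 * step)

def convolve(ell, test):
    """Formula (6): (ell * u)(x) = (u_x, ell) with u_x(y) = u(x - y)."""
    return lambda x: ell(lambda y: test(x - y))

delta_conv = convolve(delta, u)       # reproduces u
d_delta_conv = convolve(d_delta, u)   # approximates the derivative of u
print(delta_conv(0.4), u(0.4), d_delta_conv(0.4), du(0.4))
```

In this model convolution with $\delta$ acts as the identity and convolution with $D\delta$
acts as differentiation, in line with the rules for transposes above.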


Example 10. Let $\phi: \mathbb{R}^n \to \mathbb{R}^n$ be a $C^\infty$ mapping that takes compact subsets of $\mathbb{R}^n$
into compact subsets; suppose that $\phi$ is invertible and that its inverse $\psi$ has the same
properties. Then the transformation $T$ defined by $(Tu)(x) = u(\phi(x))$ maps $C_0^\infty$
linearly and continuously into $C_0^\infty$. The transpose of $T$ is $(T'v)(y) = v(\psi(y))\,J(y)$,
where $J$ is the Jacobian of $\psi$; $T'$ maps $C_0^\infty$ into itself, linearly and continuously.


Applying the extension theorem to these examples we can give meaning to the
following operations with distributions:

(i) The product of a $C^\infty$ function and a distribution is a distribution.
(ii) The derivative to any order of a distribution is a distribution.
(iii) The translate of a distribution is a distribution.
(iv) The convolution of a distribution with a $C_0^\infty$ function is a $C^\infty$ function.
(v) The composition of a distribution with an invertible $C^\infty$ mapping is a
distribution.
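
As a numerical illustration of operation (ii), take the distribution given by the
continuous function $x_+ = \max(x, 0)$ on $\mathbb{R}^1$ (an example chosen here, not taken from the
text). By the extension theorem with $T = D$ and $T' = -D$, its derivative acts on a test
function $u$ as $-\ell(Du)$, and integration by parts suggests that this equals the integral
of $u$ over $x > 0$. The sketch checks this with a midpoint rule; all names are hypothetical.

```python
import math

CENTER = 0.3  # assumed: the test function is supported in [-0.7, 1.3]

def u(x):
    """C-infinity bump centered at CENTER with radius 1."""
    t = x - CENTER
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0

def du(x):
    """Exact derivative of u (chain rule)."""
    t = x - CENTER
    if abs(t) >= 1.0:
        return 0.0
    return u(x) * (-2.0 * t) / (1.0 - t * t) ** 2

def integrate(func, a=-2.0, b=2.0, cells=40000):
    """Midpoint rule; x = 0 is a cell boundary, so the kink at 0 falls between cells."""
    h = (b - a) / cells
    return h * sum(func(a + (i + 0.5) * h) for i in range(cells))

def ell(test):
    """The distribution given by the function x_+ = max(x, 0)."""
    return integrate(lambda x: test(x) * max(x, 0.0))

# Definition via the transpose: (u, D ell) = (-Du, ell).
derivative_of_ell_at_u = -ell(du)

# Candidate for D ell: u -> integral of u over x > 0.
half_line_integral = integrate(lambda x: u(x) if x > 0.0 else 0.0)
print(derivative_of_ell_at_u, half_line_integral)
```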


On the other hand, the product of two distributions, or the composite of a
distribution with a noninvertible $C^\infty$ mapping, cannot be defined in general. In particular,
there is no reasonable way of defining $\delta^2(x)$ or $\delta(x^2)$. But there is a way of defining
the product of distributions of disjoint variables:

Exercise 2. Show that if $\ell_1$ and $\ell_2$ are distributions in the disjoint variables
$x_1, \ldots, x_m$ and $y_1, \ldots, y_n$ respectively, then their product $\ell_1 \ell_2$ can be defined
as a distribution in $\mathbb{R}^{m+n}$. In particular, prove that

$$ \delta(x_1)\,\delta(x_2) \cdots \delta(x_n) = \delta(x_1, \ldots, x_n). $$


Exercise 3. Find the first derivative of these distributions in $\mathbb{R}^1$:

(a) $\ell(u) = \int u(x)|x|\,dx$.
(b) $\ell(u) = \mathrm{PV} \int u(x)/x\,dx$.
(c) $\ell(u) = \int_0^\infty u(x)\,dx$.
(d) $\ell(u) = \delta(u) = u(0)$.


B.3 LOCAL PROPERTIES OF DISTRIBUTIONS 


We recall that the support of a continuous function $f$ is the closure of the set of points
where $f(x) \ne 0$. We will define the analogous concept for distributions; our starting
point is the following equivalent characterization of the support of a function:

The complement of the support of $f$ is the largest open set on which $f$ is zero.
We will give meaning to the concept of a distribution being zero on an open set:

Definition. A distribution $\ell$ is zero on an open set $G$ if $\ell(u) = 0$ for all $C_0^\infty$ functions
$u$ whose support is contained in $G$.

Lemma 3. If a distribution $\ell$ is zero on two open sets $G_1$ and $G_2$, then $\ell$ is zero on
the union of $G_1$ and $G_2$.

Proof. We have to show that $\ell(u) = 0$ for all $u$ whose support lies in $G_1 \cup G_2$.
We will accomplish this by decomposing $u$ as $u = u_1 + u_2$ where the support of $u_1$ is
contained in $G_1$, that of $u_2$ in $G_2$. We proceed as follows: Since each point $x$ in the
support of $u$ belongs to at least one of the sets $G_1$ or $G_2$, we construct a function $h_x$
with the following properties:



(i) $h_x(y) \ge 0$ for all $y$.
(ii) $h_x$ is $C^\infty$.
(iii) $h_x(x) > 0$.
(iv) The support of $h_x$ lies in $G_1$ or $G_2$.

The set where $h_x > 0$ is an open set containing $x$; the union of these cover the
support of $u$. By compactness, a finite number of them cover the support of $u$.
Denote by $h_1$ the sum of those of this finite collection whose support lies in $G_1$,
by $h_2$ the sum of those whose support lies in $G_2$. Their sum $h_1 + h_2$ is positive on the
support of $u$. Now set

$$ u_1 = \frac{h_1}{h_1 + h_2}\,u, \qquad u_2 = \frac{h_2}{h_1 + h_2}\,u. $$

Clearly, support $u_1 \subset G_1$, support $u_2 \subset G_2$, $u_1$ and $u_2$ are both $C^\infty$, and $u_1 + u_2 = u$.
Then

$$ (u, \ell) = (u_1 + u_2, \ell) = (u_1, \ell) + (u_2, \ell) = 0 + 0, $$

since $\ell$ is assumed to be zero both in $G_1$ and $G_2$. This completes the proof of the
lemma. □


It follows by finite induction from the lemma that if $\ell$ is zero on the open sets
$G_1, G_2, \ldots, G_n$, then it is zero on their union. We claim that the same conclusion
holds for an infinite collection of open sets $G_j$. We have to show that if the support of
$u$ is contained in $\bigcup G_j$, then $\ell(u) = 0$. By compactness, the support of $u$ is contained
in the union of finitely many $G_j$; so the conclusion follows.

The union of all open sets on which a given distribution $\ell$ is zero is the largest
open set on which $\ell$ is zero. The complement of this set is defined to be the support
of $\ell$.

The following result follows from the definition of support:

Theorem 4. Let $\ell$ be a distribution, $u$ a $C_0^\infty$ function, and suppose that the supports
of $u$ and $\ell$ are disjoint; then $\ell(u) = 0$.

Exercise 4. Show that if $\ell$ is a distribution with compact support, and $u$ a $C_0^\infty$
function, then $\ell * u$ is a $C_0^\infty$ function.

Exercise 5. Show that if $\ell$ and $m$ are two distributions and one of them has compact
support, then $\ell * m$ can be defined as a distribution.

Exercise 6. Let $f$ be a $C^\infty$ function, $\ell$ a distribution such that $f\ell = 0$. Show that $\ell$
is zero on the open set where $f(x) \ne 0$.

Exercise 7. Show that the support of the derivative of a distribution $\ell$ is contained in
the support of $\ell$.


LOCAL PROPERTIES OF DISTRIBUTIONS 549 


The support of the Dirac $\delta$ distribution is the single point $x = 0$. Therefore by exercise 7 the support of all derivatives of $\delta$ consists of $x = 0$. Conversely:

Theorem 5. Every distribution $\ell$ whose support consists of a single point, say $x = 0$, is of the form $\ell = \sum_{|\alpha| \le N} c_\alpha D^\alpha \delta$, $N$ some positive integer, $c_\alpha$ complex numbers.


Proof. We will use the following lemma: 


Lemma 6. Let $\ell$ be a distribution whose support consists of the origin. Then there is an integer $N$ such that $\ell(u) = 0$ for all $C_0^\infty$ functions $u$ which, together with all their derivatives up to order $N$, are zero at the origin.


Proof. Let $f$ be a $C^\infty$ function with these properties:

$$f(x) = \begin{cases} 0 & \text{for } |x| < 1 \\ 1 & \text{for } 2 < |x|. \end{cases}$$

Denote by $v$ the function $(1 - f)u$; then since $fu = 0$ for $|x| < 1$, by theorem 4, $\ell(fu) = 0$. Therefore

$$\ell(v) = \ell(u) - \ell(fu) = \ell(u).$$


This shows that it suffices to look at functions $u$ whose support is contained in, say, the ball $|x| < 3$. According to theorem 1, for such functions $\ell$ is continuous in some $C^N$-norm.

Take any positive real number $k$, define $f_k$ as $f_k(x) = f(kx)$, and set $u_k = f_k u$. We will show that as $k \to \infty$, $u_k$ tends to $u$ in the $C^N$-norm. It follows from the definition of $f$ that $f_k(x) = 1$ for $|x| > 2/k$; therefore $u_k(x) = u(x)$ for $|x| > 2/k$. We want to estimate $u_k$ and its derivatives for $|x| < 2/k$. Since $u$ and all its derivatives of order $\le N$ are zero at $x = 0$, for $|\beta| \le N$,

$$|D^\beta u(x)| = O(|x|^{N+1-|\beta|}); \tag{7}$$

therefore for $|x| < 2/k$,

$$|D^\beta u(x)| = O(k^{|\beta|-N-1}). \tag{8}$$

By calculus

$$D^\alpha u_k = D^\alpha (f_k u) = \sum_{\beta + \gamma = \alpha} \binom{\alpha}{\beta} D^\gamma f_k \, D^\beta u. \tag{9}$$

Since $f_k(x) = f(kx)$, $|D^\gamma f_k| = O(k^{|\gamma|})$; combining this with (8) and (9), we get $|D^\alpha u_k(x)| = O(k^{|\alpha|-N-1})$ for $|x| < 2/k$. Since $u_k(x) = u(x)$ for $|x| > 2/k$, this completes the proof that $u_k \to u$ in the $C^N$-norm. According to theorem 1, it follows that $\ell(u_k)$ tends to $\ell(u)$. Since $f_k$, and therefore $u_k$, is zero in a ball around the origin, it follows from theorem 4 that $\ell(u_k) = 0$. But then so is $\ell(u) = \lim \ell(u_k)$. □


Let $u_1$ and $u_2$ be two $C_0^\infty$ functions whose values, and the values of all their derivatives up to order $N$, are equal at $x = 0$. Then by lemma 6, applied to $u_1 - u_2$, $\ell(u_1) = \ell(u_2)$. In other words, $\ell(u)$ only depends on the value of $u$ and its derivatives up to order $N$ at $x = 0$. It follows that $\ell$ must be of the form

$$\ell(u) = \sum_{|\alpha| \le N} a_\alpha D^\alpha u(0),$$

with constants $a_\alpha$. The conclusion of theorem 5 is just a restatement of this relation. □


Theorem 7. Every distribution $\ell$ of compact support can be written in the form

$$\ell = \sum_{|\alpha| \le L} D^\alpha g_\alpha,$$

where the $g_\alpha$ are continuous functions, and $L$ some whole number.


Proof. Let $h$ be a $C_0^\infty$ function that equals 1 on an open set containing the support of $\ell$. It follows from theorem 4 that for all $u \in C_0^\infty$, $\ell(u) = \ell(hu)$.

Since every function of the form $hu$ is zero outside the support of $h$, it follows from theorem 1 that there is a positive constant and an integer $N$ such that $|\ell(hu)| \le \text{const.}\,|hu|_N$. Clearly, $|hu|_N \le \text{const.}\,|u|_N$, the constant depending only on $h$. Therefore

$$|\ell(u)| \le \text{const.}\,|u|_N \tag{10}$$

for all $u$, with a constant and $N$ independent of $u$.
We introduce now the norms

$$\|u\|_M = \Bigl( \sum_{|\alpha| \le M} \int |D^\alpha u|^2 \, dx \Bigr)^{1/2} \tag{11}$$

and denote by $H_M$ the completion of $C_0^\infty$ under this norm. Since this norm comes from a scalar product that we denote as $(\ ,\ )_M$, $H_M$ is a Hilbert space. It follows from the Riesz-Frechet theorem that every continuous linear functional in $H_M$ is of the form

$$(u, g)_M, \qquad g \in H_M. \tag{12}$$

$H_M$ is the Sobolev space $W^{M,2}$ introduced in chapter 5.
According to an important inequality due to Sobolev (see Adams's book cited in chapter 5), for some constant depending only on the size of the support of $u$,

$$|u|_N \le \text{const.}\,\|u\|_M \quad \text{for } N < M - \frac{n}{2}. \tag{13}$$


It follows from (13) that every Cauchy sequence $\{u_k\}$ in the $H_M$-norm of $C_0^\infty$ functions is also a Cauchy sequence in the $C^N$-norm. So such a sequence $\{u_k\}$ tends to a $C^N$ function $u$. This mapping of $H_M$ into $C^N$ is one-to-one, so that this is an embedding of $H_M$ in $C^N$.

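Combining (10) and (13), we see that $|\ell(u)| \le \text{const.}\,\|u\|_M$, which implies that $\ell(u)$ is a bounded linear functional in $H_M$. Therefore it can be represented in the form (12):

$$\ell(u) = (u, g)_M = \sum_{|\alpha| \le M} (D^\alpha u, D^\alpha g).$$

The right side can be rewritten as

$$\ell(u) = \sum_{|\alpha| \le M} (-1)^{|\alpha|} (u, D^{2\alpha} g) = \Bigl( u, \sum_{|\alpha| \le M} (-1)^{|\alpha|} D^{2\alpha} g \Bigr),$$

where $D^{2\alpha} g$ denotes a distribution derivative of $g$. Thus

$$\ell = \sum_{|\alpha| \le M} (-1)^{|\alpha|} D^{2\alpha} g. \tag{14}$$

Since $g$ belongs to $H_M$, $g$ is a $C^N$ function; thus (14) is the representation of $\ell$ stated in theorem 7, with $L = 2M - N$. □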
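A one-dimensional example may make theorem 7 concrete. It is offered as an illustration only, and the choice of the function $x_+ = \max(x, 0)$ is ours rather than the text's. The Dirac distribution $\delta$ has compact support, and it is the second distribution derivative of the continuous function $x_+$, so the theorem holds for it with $L = 2$. Indeed, for every $C_0^\infty$ function $u$, integration by parts gives:

```latex
\[
(u, D^2 x_+) = (D^2 u, x_+) = \int_0^\infty x\,u''(x)\,dx
  = \bigl[\,x\,u'(x)\,\bigr]_0^\infty - \int_0^\infty u'(x)\,dx
  = u(0) = (u, \delta).
\]
```

The boundary term vanishes because $u$ has compact support and the factor $x$ is zero at the origin.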


The next result shows that just like functions, distributions are determined up to a constant by their first derivatives. More precisely, we have

Theorem 8. Let $G$ be an open, connected subset of $\mathbb{R}^n$, $\ell$ a distribution in $\mathbb{R}^n$ such that all first partial derivatives $D_j \ell$ of $\ell$ are zero in $G$. Then $\ell = \text{const.}$ in $G$.

Proof. $\ell = \text{const.}$ in $G$ means that there is a constant $c$ such that for all $u$ whose support lies in $G$,

$$\ell(u) = c \int u \, dx. \tag{15}$$

We will use the following lemma.


Lemma 9. Let $b(x)$ denote any $C_0^\infty$ function whose support lies in the unit ball $|x| < 1$, and whose integral $\int b \, dx = 1$. Let $k$ be any positive number, and define $b_k$ as $b_k(x) = k^n b(kx)$. Then for any distribution $\ell$, $\ell_k = b_k * \ell$ tends to $\ell$ in the sense that for any $C_0^\infty$ function $u$, $\ell_k(u)$ tends to $\ell(u)$ as $k$ tends to $\infty$.


Proof. There is no harm in supposing that $b$ is symmetric: $b(-x) = b(x)$; then so is $b_k$, and so convolution with $b_k$, acting on $C_0^\infty$ functions, is its own transpose. By definition of $b_k * \ell$ based on the extension theorem, for any $C_0^\infty$ function $u$,

$$\ell_k(u) = (b_k * \ell)(u) = (b_k * \ell, u) = (\ell, b_k * u). \tag{16}$$

It is not hard to prove (see chapter 11, section 11.1) that the sequence of $C_0^\infty$ functions $b_k * u$ converges to $u$. Then $\ell(b_k * u)$ converges to $\ell(u)$, and so the lemma follows from (16). □
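The convergence $b_k * u \to u$ that drives lemma 9 can be observed numerically. The sketch below is an illustration in one dimension under several assumptions: the bump profile `b`, the midpoint-rule helper `integrate`, the sample point 0.4, and the test function `u` (smooth and rapidly decaying rather than compactly supported) are all choices made here, not taken from the text.

```python
import math

def b(x):
    """Unnormalized smooth bump supported in |x| < 1, symmetric in x."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def integrate(f, lo, hi, n=2000):
    """Midpoint rule on [lo, hi] with n cells."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

MASS = integrate(b, -1.0, 1.0)   # normalizing constant so that b / MASS has integral 1

def b_k(x, k):
    return k * b(k * x) / MASS   # n = 1, so k^n = k

def mollify(u, x, k):
    """(b_k * u)(x) = integral of b_k(y) * u(x - y) over |y| < 1/k."""
    return integrate(lambda y: b_k(y, k) * u(x - y), -1.0 / k, 1.0 / k)

def u(x):
    return math.exp(-x * x) * math.sin(3.0 * x)   # sample smooth function

# Pointwise error at x = 0.4 for increasing k.
errors = [abs(mollify(u, 0.4, k) - u(0.4)) for k in (1, 4, 16, 64)]
```

The list `errors` shrinks as `k` grows; for a symmetric `b` one expects the error to decay roughly like $1/k^2$.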


For $C_0^\infty$ functions, convolution and differentiation commute. Therefore, according to exercise 1, they commute also for distributions:

$$D_j(b * m) = (D_j b) * m$$

for any $C_0^\infty$ function $b$ and any distribution $m$.

Exercise 8. Show that

$$D_j(b * m) = b * D_j m$$

for every $C_0^\infty$ function $b$ and every distribution $m$.

Lemma 10. Let $b$ denote a $C_0^\infty$ function whose support lies in a ball of radius $r$ around the origin. Let $m$ be any distribution; then every point $x$ in the support of $b * m$ has distance $< r$ from some point of the support of $m$.


Exercise 9. Prove lemma 10. 


Define $b_k$ as in lemma 9. It follows from exercise 8 that

$$D_j \ell_k = D_j (b_k * \ell) = b_k * D_j \ell.$$

The support of $b_k$ is confined to a ball of radius $1/k$ around the origin, and according to the assumption in theorem 8, the support of $D_j \ell$ is contained in the complement of $G$. Therefore it follows from lemma 10 that $D_j \ell_k$ is zero in $G_k$, the set of all points of $G$ whose distance from the boundary of $G$ is greater than $1/k$.

Let $u$ be any $C_0^\infty$ function whose support $S$ is contained in $G$. It is not hard to show that since $G$ is connected, there exists a positive $d$ such that any two points of $S$ can be connected by a polygonal path $P$ in $G$ whose distance from the boundary of $G$ is $> d$. For $k$ larger than $1/d$, such a path $P$ lies in $G_k$, and therefore all partial derivatives $D_j$ of $\ell_k$ are zero along $P$. It follows that $\ell_k$ is a constant along $P$, the same constant for all points of $S$. Call this constant $c_k$; then by (2)

$$\ell_k(u) = \int c_k u \, dx = c_k \int u \, dx.$$

According to lemma 9, the left side tends to $\ell(u)$ as $k \to \infty$; therefore the right side tends to a limit. For $\int u \, dx \ne 0$ it follows that $c_k$ tends to a limit $c$; it is easy to show that $c$ is independent of $u$, as asserted in (15). □


Theorem 11. Suppose that $g$ is a continuous function whose derivative in the sense of distributions $D_j g$ is a continuous function. Then $D_j g$ is the derivative of $g$ not only in the sense of distributions but in the classical sense.


Proof. According to lemma 9, $g$ is the limit in the sense of distributions of the sequence of $C^\infty$ functions $g_k = b_k * g$. Since differentiation commutes with convolution, $D_j g_k = b_k * D_j g$. Since both $g$ and $D_j g$ are continuous functions, according to chapter 11, section 11.1, $g_k \to g$ and $D_j g_k \to D_j g$ uniformly on compact subsets. According to calculus

$$g_k(b) - g_k(a) = \int_a^b D_j g_k \, dx_j.$$

Letting $k$ tend to $\infty$, we obtain

$$g(b) - g(a) = \int_a^b D_j g \, dx_j;$$

from this integral relation it follows that $D_j g$ is the derivative of $g$ in the classical sense. □


Definition. A distribution $\ell$ is called positive if $\ell(u)$ is nonnegative for every nonnegative $C_0^\infty$ function $u$.

Examples of positive distributions abound:

Example 11. $\ell$ a nonnegative continuous function.

Example 12. $\ell = \delta(x - a)$.

Example 13. $\ell(u) = \int u \, dm$, $m$ a measure.

Lemma 12. Let $\ell$ be a positive distribution. To each compact set $K$ in $\mathbb{R}^n$ there is a constant $c$ such that $|\ell(u)| \le c\,|u|_{\max}$ for every $C_0^\infty$ function $u$ supported in $K$.


Proof. Denote by $p$ a nonnegative $C_0^\infty$ function that is equal to 1 on $K$. Since $u$ is supported in $K$, for every $x$,

$$u(x) \le |u|_{\max}\, p(x).$$

Since $\ell$ is a positive distribution, $\ell(u) \le |u|_{\max}\, \ell(p)$. Similarly $-u(x) \le |u|_{\max}\, p(x)$, so $-\ell(u) \le |u|_{\max}\, \ell(p)$. This gives the inequality in lemma 12, with $c = \ell(p)$. □


Using the inequality $|\ell(u)| \le c\,|u|_{\max}$, we can extend, by continuity, the linear functional $\ell$ to all continuous functions $u$ with compact support; the extended functional remains positive.
According to the Riesz-Kakutani representation theorem (see Appendix A) a positive linear functional on the space $C_0$ of continuous functions with compact support can be represented as an integral with respect to a measure. This proves

Theorem 13. Every positive distribution is a measure:

$$\ell(u) = \int u \, dm.$$


B.4 APPLICATIONS TO PARTIAL DIFFERENTIAL EQUATIONS 


In chapter 9 we constructed the regular part $g$ of Green's function for smoothly bounded domains in the plane. Given any point $q$ in $D$, $g = g(p; q)$ is a harmonic function of $p$ whose value on the boundary of $D$ equals $\log |p - q|$. The difference $\log |p - q| - g(p; q)$ is Green's function $G(p; q)$. We want to show that Green's function satisfies, in the sense of distributions, the equation

$$\Delta G = 2\pi\, \delta(p - q),$$

where $\Delta$ is the Laplace operator $\Delta = D_x^2 + D_y^2$, $x$ and $y$ the Cartesian coordinates of $p$. Since the regular part of Green's function satisfies $\Delta g = 0$, it suffices to show that

$$\Delta \log \sqrt{x^2 + y^2} = 2\pi\, \delta(x, y); \tag{17}$$

here we have chosen $q = 0$.


Proof. According to the definition of derivatives of distributions, (17) means that for all $C_0^\infty$ functions $u$,

$$\int \log |p| \, \Delta u \, dx \, dy = 2\pi\, u(0). \tag{17'}$$

To see this, we write the integral on the left as the limit as $\varepsilon \to 0$ of integrals over the exterior of circular disks of radius $\varepsilon$ around the origin:

$$\int_{|p| \ge \varepsilon} \log |p| \, \Delta u \, dx \, dy. \tag{18}$$

Integration by parts changes (18) into

$$\int_{|p| \ge \varepsilon} (\Delta \log |p|)\, u \, dx \, dy + \int_{|p| = \varepsilon} \log |p| \, \partial_n u \, ds - \int_{|p| = \varepsilon} (\partial_n \log |p|)\, u \, ds. \tag{18'}$$

Here $\partial_n$ is the outward normal derivative on the boundary of the domain $|p| > \varepsilon$; that is, $\partial_n = -\partial/\partial r$ on the circle $|p| = \varepsilon$, where $r = |p|$.

Since $\log |p|$ is a regular harmonic function for $p \ne 0$, $\Delta \log |p| = 0$ for $|p| > \varepsilon$, so the double integral in (18') is zero. The first line integral in (18') is less in absolute value than $\text{const.}\, 2\pi \varepsilon |\log \varepsilon|$, where the constant is an upper bound for $|\partial_n u|$. So as $\varepsilon$ tends to zero, this term tends to zero. In the second line integral

$$\partial_n \log |p| = -\frac{d}{dr} \log r = -\frac{1}{r}.$$

On the circle $|p| = \varepsilon$ this equals $-1/\varepsilon$. The value of $u$ at every point of the circle $|p| = \varepsilon$ equals $u(0) + O(\varepsilon)$. Since the circle has length $2\pi\varepsilon$, it follows that as $\varepsilon$ tends to zero, the last term in (18') tends to $2\pi\, u(0)$; this proves formula (17). □
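The identity $\int \log|p|\,\Delta u\,dx\,dy = 2\pi\,u(0)$ can be sanity-checked numerically. The sketch below makes two assumptions that are not in the text: it uses the radial function $u = e^{-r^2}$, which is rapidly decaying but not compactly supported, and it reduces the plane integral to a radial one via $dx\,dy = 2\pi r\,dr$. The names `u`, `laplace_u` and `lhs` are illustrative.

```python
import math

def u(r):
    return math.exp(-r * r)                         # radial test function, u(0) = 1

def laplace_u(r):
    # In the plane, the Laplacian of a radial function is u'' + u'/r,
    # which for u = exp(-r^2) equals (4 r^2 - 4) exp(-r^2).
    return (4.0 * r * r - 4.0) * math.exp(-r * r)

def lhs(n=200000, r_max=10.0):
    """Midpoint rule for the integral of log(r) * Laplace(u) * 2*pi*r over 0 < r < r_max."""
    h = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += math.log(r) * laplace_u(r) * 2.0 * math.pi * r
    return total * h

value = lhs()   # to be compared with 2 * pi * u(0)
```

The logarithmic singularity at the origin is harmless here because it is multiplied by the factor $r$ from the area element.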


One of the uses of distributions in the theory of partial differential equations is first to prove the existence of a distribution solution, and then to show that this object is actually a $C^\infty$ function. Below we present an example in a simple case of how the second step is carried out; it is an extension of a classical result of Weyl.

Theorem 14. Let $\ell$ be a distribution in an open set $D$ in $\mathbb{R}^n$, which satisfies there Laplace's equation $\Delta \ell = 0$. Then $\ell$ is a $C^\infty$ harmonic function in $D$.


Proof. The proof is based on 


Lemma 15. Let $f$ denote a spherically symmetric function in $\mathbb{R}^n$, $n > 2$, namely $f(x) = g(|x|)$, and suppose that $f(x) = 0$ for $|x| \ge R$. We further require that

$$\int f(x) \, dx = 0, \qquad \int |x|^{2-n} f(x) \, dx = 0. \tag{19}$$

Then there exists a spherically symmetric $C^\infty$ function $h$, $h(x) = 0$ for $|x| > R$, that satisfies

$$\Delta h = f. \tag{20}$$


Proof. Write $h$, to be determined, as $h(x) = p(|x|)$. In terms of the polar coordinate $r$ we can write (20) as

$$\Delta h = p'' + \frac{n-1}{r}\, p' = g(r), \tag{20'}$$

where $'$ denotes differentiation with respect to $r$. Multiplying (20') by $r^{n-1}$ gives an equation that can be written as

$$(r^{n-1} p')' = g(r)\, r^{n-1}.$$

Integrating this equation gives

$$r^{n-1} p'(r) = \int_0^r s^{n-1} g(s) \, ds. \tag{21}$$

We claim that the right side is zero for $r > R$. Indeed, $g(|x|) = f(x)$, so we can rewrite it as

$$\text{const.} \int_{|x| < r} f(x) \, dx.$$

According to (19) this is zero for $r > R$, since the support of $f$ is contained in the ball of radius $R$ around the origin.
Divide (21) by $r^{n-1}$ and integrate:

$$p(r) = \int_0^r t^{1-n} \int_0^t s^{n-1} g(s) \, ds \, dt. \tag{22}$$

We claim that $p(r)$ is zero when $r > R$. To see this, integrate (22) by parts. Since $p'(r) = 0$ for $r > R$, there are no boundary terms, so we get, since $n > 2$,

$$p(r) = \frac{1}{n-2} \int_0^R s\, g(s) \, ds.$$

We can rewrite this integral as

$$\text{const.} \int_{|x| < R} |x|^{2-n} f(x) \, dx,$$

which is assumed in (19) to be zero because the support of $f$ is contained in $|x| < R$.
Next we show that $h$ is a $C^\infty$ function; first we claim that $p$ is, which is not hard to deduce from formula (21).
Setting $x = (r, 0, \ldots, 0)$ into the relation $f(x) = g(|x|)$, we get that

$$f(r, 0, \ldots, 0) = g(|r|).$$

This shows that $g(r)$ can be extended as an even $C^\infty$ function to all real $r$. It follows from (20') that $p(r)$ too can be extended as an even $C^\infty$ function to all real $r$; therefore all derivatives of $p$ of odd order are zero at $r = 0$. From this it is not hard to deduce that $p(r) = q(r^2)$, $q$ another $C^\infty$ function. Then $h(x) = p(|x|) = q(|x|^2)$ is also $C^\infty$, as claimed in the lemma. □


Exercise 10. State and prove lemma 15 for n = 2. 


We return to theorem 14. We will, as in lemma 9, approximate $\ell$ by a sequence of $C^\infty$ functions. Take $n > 2$ and let $b$ denote a spherically symmetric $C^\infty$ function, supported in $|x| < 1$, satisfying the two conditions:

$$\int b(x) \, dx = 1, \qquad \int |x|^{2-n} b(x) \, dx = 0. \tag{23}$$

We define, as in lemma 9, the functions $b_k$ as $b_k(x) = k^n b(kx)$. Clearly, these functions, too, satisfy conditions (23):

$$\int b_k(x) \, dx = 1, \qquad \int |x|^{2-n} b_k(x) \, dx = 0. \tag{23'}$$

The support of $b_k$ is confined to a ball of radius $1/k$ around the origin. The convolution $\ell_k = b_k * \ell$ can be defined as a $C^\infty$ function in the domain $D_k$ defined as the set of all points of $D$ whose distance from the boundary of $D$ is greater than $1/k$. By formula (6), for $y$ in $D_k$,

$$\ell_k(y) = (b_{k,y}, \ell),$$

where $b_{k,y} = b_k(y - x)$. Compare two of these functions:

$$\ell_k(y) - \ell_m(y) = (b_{k,y} - b_{m,y}, \ell). \tag{24}$$


The difference $b_{k,y} - b_{m,y}$ is spherically symmetric about the point $y$, and its support is contained in a ball of radius $R$ around the point $y$, where $R = \max(1/k, 1/m)$. It follows from (23') that each of the functions $b_{k,y} - b_{m,y}$ satisfies conditions (19). Therefore according to lemma 15, it is the Laplacean of a $C^\infty$ function $h$, spherically symmetric about the point $y$, whose support is contained in the ball of radius $R$ around $y$, satisfying $\Delta h = b_{k,y} - b_{m,y}$. So we can rewrite (24) as

$$\ell_k(y) - \ell_m(y) = (\Delta h, \ell).$$

According to theorem 2 we can rewrite the right side as $(h, \Delta \ell)$, which is zero since $\Delta \ell = 0$. So we conclude that at all points $y$ in $D$, $\ell_k(y) = \ell_m(y)$ when $k$ and $m$ are greater than $1/d$, $d$ the distance of $y$ to the boundary of $D$. Therefore in every compact subset of $D$, $\ell_k$ does not depend on $k$ for $k$ large enough. According to lemma 9, the $C^\infty$ functions $\ell_k$ converge in the sense of convergence of distributions to $\ell$. This proves that $\ell$ is $C^\infty$ in $D$, as claimed in theorem 14. □


Theorem 14 is true for solutions of any elliptic partial differential equation with $C^\infty$ coefficients.

Solutions of hyperbolic partial differential equations are quite different in regard to differentiability. Again we take the simplest case, the wave equation in one space dimension:

$$u_{tt} - u_{xx} = 0. \tag{25}$$

Every function of form

$$u(x, t) = f(x + t) + g(x - t), \tag{26}$$

where $f$ and $g$ are twice differentiable functions, is a solution of (25). Conversely, all twice differentiable solutions of (25) can be written in this form. To see this, rewrite (25) as

$$(D_t + D_x)(u_t - u_x) = 0, \qquad (D_t - D_x)(u_t + u_x) = 0.$$

From these equations we conclude, respectively, that

$$u_t - u_x = a(x - t), \qquad u_t + u_x = b(x + t),$$

where $a$ and $b$ are once differentiable. Solving these two relations for $u_t$ and $u_x$ and integrating yields the representation (26).
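That every function of the form $f(x+t) + g(x-t)$ solves the wave equation is easy to test with finite differences. In the sketch below the profiles `f` and `g`, the step size, and the helper `wave_residual` are illustrative assumptions. With the same step in $x$ and $t$, the central second differences of such a function agree exactly in exact arithmetic, so the residual is pure rounding error.

```python
import math

def f(s):
    return math.sin(s)            # sample twice-differentiable profile

def g(s):
    return math.exp(-s * s)       # another sample profile

def u(x, t):
    return f(x + t) + g(x - t)

def wave_residual(x, t, h=1e-3):
    """Central-difference approximation of u_tt - u_xx at the point (x, t)."""
    u_tt = (u(x, t + h) - 2.0 * u(x, t) + u(x, t - h)) / (h * h)
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / (h * h)
    return u_tt - u_xx
```

Evaluating `wave_residual` at a handful of points gives values that are zero up to floating-point noise.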
What about solutions that are distributions? We claim that for any pair of distributions $\ell$ and $m$ in a single variable,

$$u = \ell(x + t) + m(x - t) \tag{26'}$$

is a solution of the wave equation in the sense of distributions. To verify this, we note that it follows from example 6 in section B.2 that $\ell(x + t)$, $m(x - t)$ can be defined as distributions in $x$, $t$, and that their partial derivatives can be calculated by the chain rule.

Exercise 11. Carry out these steps.

Furthermore it can be shown that every distribution solution of (25) is of form (26').

Exercise 12. Show this.

What is the use of distribution solutions of the wave equation? Plenty! Take the propagation of acoustic waves, governed by the wave equation. For instance, the honking of a horn can be described as a solution of the form (26'), with $\ell$ and $m$ functions equal to some constant $c$ on an interval $I$, and zero outside $I$.
Other uses of distributions are given in section 7.2 and in Lax (1955).


B.5 THE FOURIER TRANSFORM


We saw in the previous sections that the class of $C_0^\infty$ functions is the natural domain for some operators, such as differentiation. However, it is too narrow for other equally important operators, such as the Fourier transformation. For this operator the following larger space of functions, denoted as S, turns out to be natural:

S consists of all complex-valued $C^\infty$ functions $u$ defined in $\mathbb{R}^n$, which, together with all their derivatives, tend to zero faster than any power of $|x|^{-1}$ as $|x| \to \infty$. That is, a function $u$ belongs to S iff for all multi-indices $\alpha$ and all positive integers $b$,

$$\lim_{|x| \to \infty} |x|^b\, |D^\alpha u(x)| = 0. \tag{27}$$

Define the norms

$$|u|_{b,\alpha} = \max_x |x|^b\, |D^\alpha u(x)|; \tag{28}$$

we can describe S as the collection of those $C^\infty$ functions for which all the norms (28) are finite.


Exercise 13. Show that the finiteness of the norms (28) implies (27). 
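As a concrete instance of the norms (28), consider the Gaussian $u(x) = e^{-x^2/2}$ on the line with $\alpha = 0$. The sketch below reads (28) as $|u|_{b,\alpha} = \max_x |x|^b |D^\alpha u(x)|$, which is an assumption about the intended formula, and the names `seminorm` and `exact` are ours. Elementary calculus puts the maximum of $|x|^b e^{-x^2/2}$ at $|x| = \sqrt{b}$, with value $(b/e)^{b/2}$.

```python
import math

def gauss(x):
    return math.exp(-x * x / 2.0)

def seminorm(b, x_max=20.0, steps=20000):
    """Grid approximation of max over x >= 0 of x^b * gauss(x) (the case alpha = 0)."""
    h = x_max / steps
    return max((i * h) ** b * gauss(i * h) for i in range(steps + 1))

def exact(b):
    """Closed form (b/e)^(b/2), attained at |x| = sqrt(b)."""
    return (b / math.e) ** (b / 2.0)
```

The norms are finite for every $b$ but grow quickly with $b$, which is typical for functions of class S.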


Definition. A sequence $\{u_n\}$ of functions in S is said to converge to $u$ if $\lim |u_n - u|_{b,\alpha} = 0$ for all $b$ and all $\alpha$.

Definition. Define the distance $d(u, v)$ of two functions $u$ and $v$ in S as

$$d(u, v) = \sum \frac{1}{2^{b + |\alpha|}}\, \frac{|u - v|_{b,\alpha}}{1 + |u - v|_{b,\alpha}}, \tag{29}$$

the summation over all multi-indices $\alpha$ and positive integers $b$.


Exercise 14.

(a) Show that a sequence $\{u_n\}$ converges to $u$ iff $d(u_n, u) \to 0$.
(b) Show that $d(u, v)$ as defined by (29) satisfies the triangle inequality.
(c) Show that S under the metric (29) is a complete metric space.
(d) Show that S is a linear space.
(e) Show that $C_0^\infty$ is a dense subspace of S in the metric (29).


Definition. The dual S' of S consists of all linear functionals $\ell$ on S that are continuous; that is, if

$$\lim u_n = u, \quad \text{then} \quad \lim \ell(u_n) = \ell(u).$$

S' is a linear space. Since $C_0^\infty$ is a subspace of S, $\ell$ acts as a linear functional on $C_0^\infty$.


Exercise 15.

(a) Show that $\ell$ restricted to $C_0^\infty$ is continuous in the sense of convergence in $C_0^\infty$ defined in section B.1.

(b) Show that if $\ell$ and $m$ are in S' and $\ell \ne m$, then $\ell \ne m$ acting on $C_0^\infty$.

It follows from exercise 15 that elements of S' are distributions; they are called tempered distributions. As before, we will use the notation $\ell(u) = (u, \ell)$ for $u$ in S, $\ell$ in S'.


Here are some examples of tempered distributions: 
Example 14. Any distribution of compact support. 


Example 15. Any function $v(x)$ that grows slower than some power of $|x|$ as $|x| \to \infty$, $\ell(u) = \int v u \, dx$.


We come now to the main topic of this section. 


Definition. The Fourier transform of any function $u$ of class S, denoted as $Fu$, and also as $\hat u$, is defined in the usual fashion by the formula

$$\hat u(\xi) = (Fu)(\xi) = \int u(x)\, e^{i x \cdot \xi}\, đx, \quad \text{where } đx = \frac{dx}{(2\pi)^{n/2}}. \tag{30}$$

Exercise 16. Show that $|Fu|_{\max} \le (2\pi)^{-n/2}\, |u|_{L^1}$.
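Formula (30) can be evaluated by simple quadrature when $n = 1$. The sketch below assumes the kernel $e^{ix\xi}$ and the normalization $(2\pi)^{-1/2}$; the helper `fourier`, its truncation to $[-12, 12]$, and the test function `gauss` are illustrative choices. For the Gaussian $e^{-x^2/2}$ the transform should again be $e^{-\xi^2/2}$, and the bound of exercise 16 reads $|\hat u| \le 1$, since $\int e^{-x^2/2}\,dx = \sqrt{2\pi}$.

```python
import cmath
import math

def fourier(u, xi, x_max=12.0, n=2400):
    """Midpoint rule for (2*pi)^(-1/2) * integral of u(x) * exp(i*x*xi) over [-x_max, x_max]."""
    h = 2.0 * x_max / n
    total = 0j
    for j in range(n):
        x = -x_max + (j + 0.5) * h
        total += u(x) * cmath.exp(1j * x * xi)
    return total * h / math.sqrt(2.0 * math.pi)

def gauss(x):
    return math.exp(-x * x / 2.0)
```

Because the Gaussian is even, the result would be the same under the opposite sign convention $e^{-ix\xi}$.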


Theorem 16. F maps S into S continuously. 


Proof. The integral in formula (30) can be differentiated with respect to $\xi$, giving

$$D^\alpha Fu = F((ix)^\alpha u). \tag{31}$$

For $u$ in S, $x^\alpha u$ tends to zero as $|x| \to \infty$ faster than any power of $|x|^{-1}$; therefore $x^\alpha u$ belongs to $L^1$. It follows that for any multi-index $\alpha$, $D^\alpha Fu$ is a continuous function. This proves that $Fu$ is $C^\infty$. To show that $Fu$ is in class S, we deduce from (30) by integration by parts on the right that for any multi-index $\beta$,

$$\xi^\beta Fu = F((iD)^\beta u). \tag{32}$$

Combining (31) and (32), we get $\xi^\beta D^\alpha Fu = i^{|\alpha| + |\beta|} F(D^\beta x^\alpha u)$. This shows that $\xi^\beta D^\alpha Fu$ is a linear combination of the Fourier transforms of functions of the form $x^\gamma D^\sigma u$. Since for $u$ of class S these belong to $L^1$, it follows from exercise 16 that their Fourier transform is bounded in $\mathbb{R}^n$. This shows that $Fu$ belongs to class S. □

Exercise 17. Show that the mapping $u \to Fu$ is a continuous mapping of S into S.

The following theorem summarizes the relation of the Fourier transform to the usual operations in $\mathbb{R}^n$.

Theorem 17.

(i) $F$ transmutes translation in $\mathbb{R}^n$ into multiplication by $e^{i a \cdot \xi}$. That is, define $T_a$ by $(T_a u)(x) = u(x - a)$. Then

$$F T_a u = e^{i a \cdot \xi} Fu \quad \text{and} \quad T_a Fu = F e^{-i a \cdot x} u.$$

(ii) The infinitesimal version of (i) is

$$F D_j u = -i \xi_j Fu \quad \text{and} \quad D_j Fu = F(i x_j u).$$

(iii) $F$ commutes with rotation around the origin, and with reflection $R$ defined by $(Ru)(x) = u(-x)$.

(iv) Let $A$ be an invertible map of $\mathbb{R}^n$ into $\mathbb{R}^n$. Denote $u(Ax)$ as $u_A(x)$. Then

$$F u_A = \frac{1}{|\det A|}\, (Fu)_B,$$

where $B = (A^{-1})'$.

Exercise 18. Prove theorem 17.

Exercise 19. Prove that the convolution of two functions $u$ and $v$ in S is in S, and that

$$F(u * v) = (2\pi)^{n/2} (Fu)(Fv).$$
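The translation rules of theorem 17 can be checked by quadrature in one dimension. The sketch below assumes the convention $\hat u(\xi) = (2\pi)^{-1/2}\int u(x)\,e^{ix\xi}\,dx$, under which $F T_a u = e^{ia\xi} Fu$ and $T_a Fu = F(e^{-iax}u)$; under the opposite sign convention the two exponents change sign. The helper names, the sample function `u`, and the values of `a` and `xi` are illustrative assumptions.

```python
import cmath
import math

def fourier(u, xi, x_max=14.0, n=2800):
    """Midpoint rule for (2*pi)^(-1/2) * integral of u(x) * exp(i*x*xi) over [-x_max, x_max]."""
    h = 2.0 * x_max / n
    total = 0j
    for j in range(n):
        x = -x_max + (j + 0.5) * h
        total += u(x) * cmath.exp(1j * x * xi)
    return total * h / math.sqrt(2.0 * math.pi)

def u(x):
    return (1.0 + x) * math.exp(-x * x / 2.0)   # sample rapidly decaying function

def translate(func, a):
    """T_a func, defined by (T_a func)(x) = func(x - a)."""
    return lambda x: func(x - a)

a, xi = 0.7, 1.3

# F T_a u compared with exp(i*a*xi) * F u
lhs_translation = fourier(translate(u, a), xi)
rhs_translation = cmath.exp(1j * a * xi) * fourier(u, xi)

# T_a F u compared with F(exp(-i*a*x) * u)
lhs_modulation = fourier(u, xi - a)
rhs_modulation = fourier(lambda x: cmath.exp(-1j * a * x) * u(x), xi)
```

Both pairs agree to within quadrature and rounding error.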


The kernel of the Fourier transform is a symmetric function of $x$ and $\xi$. Therefore $F$ is its own transpose $F'$. So we can use theorem 2, the extension theorem, to define the Fourier transform of a tempered distribution $\ell$:

$$(Fv, \ell) = (v, F\ell) \tag{33}$$

for all $v$ in S.

Theorem 17'. Theorem 17 is valid for the Fourier transform of tempered distributions.


Exercise 20. Prove theorem 17’. 


For distributions $\ell$ with compact support, we can use definition (30) directly to define $F\ell$ as a $C^\infty$ function by the formula

$$\hat\ell(\xi) = (e(\xi), \ell), \qquad e(\xi) = (2\pi)^{-n/2}\, e^{i x \cdot \xi}. \tag{34}$$

Exercise 21.

(a) Show that the Fourier transform of a distribution $\ell$ of compact support, as defined by (34), is a $C^\infty$ function.

(b) Show that $F\ell$ as defined by (34) satisfies (33).

We now present some examples of tempered distributions and their Fourier transforms; the first five are set in $\mathbb{R}$.


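Example 16.

$$\ell(x) = \begin{cases} 1 & \text{for } |x| < 1 \\ 0 & \text{for } |x| > 1, \end{cases} \qquad \hat\ell(\xi) = \sqrt{\frac{2}{\pi}}\, \frac{\sin \xi}{\xi}.$$

Example 17.

$$\ell(x) = \begin{cases} 1 - |x| & \text{for } |x| < 1 \\ 0 & \text{for } |x| > 1, \end{cases} \qquad \hat\ell(\xi) = \sqrt{\frac{2}{\pi}}\, \frac{1 - \cos \xi}{\xi^2}.$$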


Example 18.


Q(x) sel, EE) = = 


Example 19. 




Example 20.


nysa Pi ag) ee PP 
ee eee er ae 


Example 21. $\ell = \delta$, $\hat\ell = (2\pi)^{-n/2}$.


Exercise 22. Verify, using formula (30) or (34), the Fourier transforms given in examples 16 to 21.
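A verification of this kind can be preceded by a quick numerical check. The sketch below treats the indicator function of the interval $[-1, 1]$ on the line, assuming the normalization $(2\pi)^{-1/2}$ in (30). Since the function is even, only the cosine part of the kernel contributes, so the sign convention in the exponent does not matter. The function names are illustrative.

```python
import math

def indicator_transform(xi, n=4000):
    """Midpoint rule for (2*pi)^(-1/2) * integral over [-1, 1] of cos(x * xi) dx."""
    h = 2.0 / n
    total = sum(math.cos((-1.0 + (j + 0.5) * h) * xi) for j in range(n))
    return total * h / math.sqrt(2.0 * math.pi)

def closed_form(xi):
    """sqrt(2/pi) * sin(xi) / xi, the value obtained by integrating the cosine exactly."""
    return math.sqrt(2.0 / math.pi) * math.sin(xi) / xi
```

The quadrature and the closed form agree to the accuracy of the midpoint rule.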


Theorem 18. The Fourier transform of $\ell = 1$ in $\mathbb{R}^n$ is $\hat\ell = (2\pi)^{n/2}\, \delta$.


Proof. Denote the Fourier transform of $\ell = 1$ by $d$. According to part (ii) of theorem 17', for $j = 1, \ldots, n$,

$$x_j d = x_j F1 = F(i D_j 1) = 0. \tag{35}$$

According to exercise 6, if $f$ is a $C^\infty$ function and $\ell$ a distribution such that $f\ell = 0$, it follows that the support of $\ell$ is contained in the nullset of $f$. Applying this to $f = x_j$, $j = 1, \ldots, n$, we conclude from (35) that the support of $d$ is $x = 0$. According to theorem 5 it follows that $d$ is of the form

$$d = \sum_{|\alpha| \le N} c_\alpha D^\alpha \delta. \tag{36}$$

It follows from (35) that for any multi-index $\beta$, $|\beta| > 0$, $x^\beta d = 0$. Combining this with (36), we conclude that for any $\beta$, $|\beta| > 0$,

$$\sum_{|\alpha| \le N} c_\alpha\, x^\beta D^\alpha \delta = 0. \tag{37}$$


We want to deduce from this that N = 0; we need the following lemma: 


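Lemma 19.

$$x^\beta D^\alpha \delta = \begin{cases} 0 & \text{if } |\alpha| < |\beta| \\ 0 & \text{if } |\alpha| = |\beta|,\ \alpha \ne \beta \\ (-1)^{|\alpha|}\, \alpha!\, \delta & \text{if } \alpha = \beta. \end{cases}$$

Proof. Let $u$ be any $C_0^\infty$ function:

$$(u, x^\beta D^\alpha \delta) = (x^\beta u, D^\alpha \delta) = (-1)^{|\alpha|} (D^\alpha x^\beta u, \delta). \tag{38}$$

For $|\alpha| < |\beta|$, and for $|\alpha| = |\beta|$ but $\alpha \ne \beta$, the function $D^\alpha x^\beta u$ is zero at $x = 0$. Therefore for such values of $\alpha$ and $\beta$, (38) is zero. For $\alpha = \beta$, $D^\alpha x^\alpha u$ equals $\alpha!\, u(0)$ at $x = 0$, so $(D^\alpha x^\alpha u, \delta) = \alpha!\, u(0)$. This is the assertion of lemma 19. □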
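The simplest nontrivial case of lemma 19 may help fix the signs. Take $n = 1$ and $\alpha = \beta = 1$, so that the lemma asserts $x\,\delta' = -\delta$. This worked case is added for illustration; for any $C_0^\infty$ function $u$ it follows directly from the definition of the derivative of a distribution:

```latex
\[
(u, x\,\delta') = (x u, \delta') = -\bigl( (x u)', \delta \bigr)
  = -\bigl( u + x u', \delta \bigr) = -u(0) = (u, -\delta).
\]
```

In the same way $(u, x\,\delta) = (xu)(0) = 0$, which is the case $|\alpha| < |\beta|$ with $\alpha = 0$, $\beta = 1$.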


Suppose now that $N$ is $> 0$; then there is an $\alpha$, $|\alpha| = N$, for which $c_\alpha \ne 0$. Combining (37), taken with $\beta = \alpha$, with lemma 19, we conclude that $c_\alpha = 0$, a contradiction. Therefore we conclude from (36) that $d = c_0 \delta$. So all that remains to prove theorem 18 is to determine the constant $c_0$. This is easily done; in the definition (33) of $F\ell$,

$$(Fv, \ell) = (v, F\ell),$$

set $\ell(\xi) \equiv 1$ and $v = e^{-|x|^2/2}$; then $F\ell = d = c_0 \delta$, and according to exercise 5, $Fv = e^{-|\xi|^2/2}$:

$$(e^{-|\xi|^2/2}, 1) = (e^{-|x|^2/2}, c_0 \delta).$$

The left side equals $\int_{\mathbb{R}^n} e^{-|\xi|^2/2} \, d\xi = (2\pi)^{n/2}$; the right side equals $c_0$, the assertion of theorem 18. □


Theorem 20.

(i) $F$ is an invertible mapping of S into S, and its inverse is given by

$$u(x) = \int \hat u(\xi)\, e^{-i x \cdot \xi}\, đ\xi. \tag{39}$$

(ii) $F$ is an invertible mapping of S' into S', and $F^{-1} = FR$, where $R$ is defined in part (iii) of theorem 17.


Proof.

(i) According to part (i) of theorem 17, $F e^{-i a \cdot x} = T_a F$. This implies, by theorem 17', that for every tempered distribution $\ell$,

$$F e^{-i a \cdot x} \ell = T_a F \ell.$$

Take $\ell = 1$; we get, using theorem 18, that

$$F e^{-i a \cdot x} = (2\pi)^{n/2}\, \delta(x - a).$$

Setting this into formula (33), we get for all $v$ in S and $\ell = e^{-i a \cdot x}$ that

$$(\hat v, e^{-i a \cdot \xi}) = (2\pi)^{n/2} (v, \delta(x - a)).$$

The left side is $\int \hat v(\xi)\, e^{-i a \cdot \xi} \, d\xi$, the right side is $(2\pi)^{n/2}\, v(a)$. This proves (39).

(ii) In (39) replace $\xi$ by $-\xi$ as variable of integration; we get that $F^{-1} = FR$. We can express this as $FRF = I$, the identity. So for any function $v$ in S and any tempered distribution $\ell$,

$$(v, \ell) = (FRFv, \ell) = (RFv, F\ell) = (Fv, RF\ell) = (v, FRF\ell).$$

This proves that $FRF\ell = \ell$, that is, that $RF$ is a right inverse for $F$ on S', and $FR$ a left inverse. Since $F$ and $R$ commute, this proves part (ii). □




Theorem 21 (Parseval's Formula). The Fourier transform $\hat u$ of every $L^2$ function $u$ lies in $L^2$ and $\|\hat u\|_{L^2} = \|u\|_{L^2}$.

Proof. We want first to prove this for $u$ in S. Take the complex conjugate of formula (30) defining the Fourier transform:

$$\overline{Fu}(\xi) = \int \bar u(x)\, e^{-i x \cdot \xi}\, đx.$$

The right side can be written as $RF\bar u$, so

$$\overline{Fu} = RF\bar u. \tag{40}$$

Since the Fourier transform is its own transpose,

$$(Fu, v) = (u, Fv).$$

Now set $v = \overline{Fu}$; using (40), we get

$$(Fu, \overline{Fu}) = (u, F\overline{Fu}) = (u, FRF\bar u) = (u, \bar u),$$

since we have shown in the course of proving theorem 20 that $FRF = I$. This proves the $L^2$ isometry of the Fourier transform acting on the class S. We get the result for any $u$ in $L^2$ by approximating $u$ with a sequence of functions of class S. □
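Parseval's formula lends itself to a direct numerical test on the line. The sketch below assumes the normalization $(2\pi)^{-1/2}$ in the Fourier transform and uses the sample function $u(x) = x\,e^{-x^2/2}$, for which $\|u\|_{L^2}^2 = \sqrt{\pi}/2$. The grid, the truncation to $[-10, 10]$, and all names are illustrative choices.

```python
import cmath
import math

def u(x):
    return x * math.exp(-x * x / 2.0)              # sample function of class S

H = 0.05
GRID = [-10.0 + (j + 0.5) * H for j in range(400)]  # midpoints covering [-10, 10]

def u_hat(xi):
    """Quadrature for (2*pi)^(-1/2) * integral of u(x) * exp(i*x*xi) dx."""
    s = sum(u(x) * cmath.exp(1j * x * xi) for x in GRID)
    return s * H / math.sqrt(2.0 * math.pi)

# Squared L^2 norms of u and of its transform, both by the midpoint rule.
norm_u_sq = H * sum(abs(u(x)) ** 2 for x in GRID)
norm_u_hat_sq = H * sum(abs(u_hat(xi)) ** 2 for xi in GRID)
```

The two squared norms coincide to high accuracy, as the theorem predicts.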


Exercise 23. Prove that $\hat u$, defined as the $L^2$ limit of $\hat u_n$, satisfies (33).

In words, theorem 21 says that the Fourier transform is a unitary operator that maps $L^2(\mathbb{R}^n)$ onto itself.
The next result is about distributions in $\mathbb{R}^1$.


Theorem 22. Define for any $a$ in $\mathbb{R}^1$

$$p_a = \sum_m \delta(x - am), \tag{41}$$

summed over all integers $m$.

(i) $p_a$ is a tempered distribution.
(ii) The Fourier transform of $p_a$ is $(\sqrt{2\pi}/|a|)\, p_b$, $b = 2\pi/a$.


Proof. Part (i) is obvious. To prove part (ii), we observe that $p_a$ is periodic with period $a$:

$$T_a p_a - p_a = 0.$$

We take the Fourier transform of this relation; using the relation of Fourier transform to translation (see theorem 17'), we get

$$(e^{i a \xi} - 1)\, \hat p_a = 0. \tag{42}$$


The function $e^{-ibx}$, $b = 2\pi/a$, is $= 1$ at all points $am$, $m$ an integer. Therefore it follows from the definition (41) of $p_a$ that

$$e^{-ibx} p_a = p_a.$$

Take the Fourier transform of this relation; using again theorem 17', we get

$$T_b \hat p_a = \hat p_a, \tag{43}$$

that is, that $\hat p_a$ is periodic with period $b$.
Let $u(\xi)$ be any function of class S that is zero at all integer multiples of $b$:

$$u(nb) = 0, \qquad n \text{ in } \mathbb{Z}.$$

Such a function is divisible by the function $e^{i a \xi} - 1$:

$$u(\xi) = (e^{i a \xi} - 1)\, v(\xi), \tag{44}$$

$v$ in S. Therefore for such a $u$

$$(u, \hat p_a) = ((e^{i a \xi} - 1) v, \hat p_a) = (v, (e^{i a \xi} - 1) \hat p_a) = 0, \tag{45}$$

where in the last step we have used (42).
It follows from (45) that if $u(nb) = 0$ for $|n| > N$, then the value of $(u, \hat p_a)$ depends only on the values of $u(nb)$, $|n| \le N$. Since this dependence is linear,

$$(u, \hat p_a) = \sum_{|n| \le N} c_n u(nb). \tag{46}$$

Since, by (43), $\hat p_a$ is $b$-periodic, it follows that all the $c_n$ are equal; we denote their common value by $c$:

$$(u, \hat p_a) = c \sum u(nb), \tag{47}$$

where $c = c(a)$ depends on $a$.

Any $u$ in S can be approximated in the topology of S by a sequence of functions $u_N$ such that $u_N(nb) = 0$ for $|n| > N$. Setting $u = u_N$ in (47) and letting $N \to \infty$ we conclude that (47) holds for all $u$ in S. We can restate this result simply in the notation of (41) as

$$\hat p_a = c(a)\, p_b. \tag{48}$$

Take the Fourier transform of both sides. Since $p_a$ is even, the Fourier transform of $\hat p_a$ is $p_a$, and so we get

$$p_a = c(a)\, \hat p_b. \tag{48'}$$

Interchanging $a$ and $b$ in (48), $\hat p_b = c(b)\, p_a$. From this and (48') we deduce that $c(a) c(b) = 1$ for $ab = 2\pi$. For $a = b = \sqrt{2\pi}$ we get $c(\sqrt{2\pi})^2 = 1$, so $c(\sqrt{2\pi})$ is either $1$ or $-1$.


The definition of the Fourier transform of a tempered distribution such as $p_a$ is that for all functions $u$ of class S, $(\hat u, p_a) = (u, \hat p_a)$. Using the definition (41) of $p_a$ and formula (47) for $\hat p_a$, we get, with $b = 2\pi/a$,

$$\sum \hat u(am) = c(a) \sum u(bn). \tag{49}$$

Given $u$ and any number $r$, define $u_r$ by $u_r(x) = u(rx)$. The Fourier transform of $u_r$ is

$$\hat u_r(\xi) = \frac{1}{|r|}\, \hat u\Bigl(\frac{\xi}{r}\Bigr).$$

Setting $u_r$ into (49) in place of $u$ gives

$$\frac{1}{|r|} \sum \hat u\Bigl(\frac{a}{r} m\Bigr) = c(a) \sum u(rbn). \tag{49'}$$

We can use (49) with $a$ replaced by $a/r$, and $b$ replaced by $rb$, to express the left side of (49') as

$$\frac{1}{|r|}\, c\Bigl(\frac{a}{r}\Bigr) \sum u(rbn).$$

Since this equals the right side of (49'), we deduce that $(1/|r|)\, c(a/r) = c(a)$. Setting $r = a/\sqrt{2\pi}$ we get $(\sqrt{2\pi}/|a|)\, c(\sqrt{2\pi}) = c(a)$. Since we have shown that $c(\sqrt{2\pi}) = \pm 1$, we deduce that

$$c(a) = \pm \frac{\sqrt{2\pi}}{|a|}.$$

We claim that the correct sign is the positive one. For take any positive, even function $v$ of class S, and define $u$ as $v * v$. Then $u$ is positive, and since the Fourier transform of the even function $v$ is real, $\hat u = \sqrt{2\pi}\, \hat v^2$ too is positive. So it follows from formula (49) that $c(a)$ is positive as well. This completes the proof of theorem 22. □


Setting $c = \sqrt{2\pi}/|a|$ into (49) yields

Poisson Summation Formula. For every function $u$ of class S and for all real $a$,

$$\sum \hat u(am) = \frac{\sqrt{2\pi}}{|a|} \sum u\Bigl(\frac{2\pi}{a} n\Bigr). \tag{50}$$
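Poisson's formula can be tested with the Gaussian $u(x) = e^{-x^2/2}$, which equals its own Fourier transform when the transform carries the factor $(2\pi)^{-1/2}$ (an assumption about the normalization in use). The formula then reads $\sum_m e^{-(am)^2/2} = (\sqrt{2\pi}/|a|) \sum_n e^{-(2\pi n/a)^2/2}$. In the sketch below the function name `poisson_sides` and the truncation of both sums to 200 terms on each side are illustrative choices.

```python
import math

def gauss(x):
    return math.exp(-x * x / 2.0)   # its own Fourier transform in this normalization

def poisson_sides(a, terms=200):
    """Return both sides of the summation formula for u = gauss, truncated to |m|, |n| <= terms."""
    b = 2.0 * math.pi / a
    left = sum(gauss(a * m) for m in range(-terms, terms + 1))
    right = math.sqrt(2.0 * math.pi) / abs(a) * sum(gauss(b * n) for n in range(-terms, terms + 1))
    return left, right
```

For small $a$ the left sum has many significant terms while the right sum is dominated by $n = 0$; for large $a$ the roles are reversed.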


Poisson’s formula is valid for a much wider class of functions than S. 


NOTE. In chapter 30, section 30.6, we have derived Poisson's formula as a special case of the trace formula applied to the convolution operator. We used there a different normalization for the Fourier transform.


We indicate now how to extend theorem 22 to $\mathbb{R}^n$. Instead of all integer multiples of $a$, we consider all points of a lattice $L$ defined as follows:


Definition. A lattice in $\mathbb{R}^n$ is a set $L$ of vectors in $\mathbb{R}^n$ with the following properties:


(i) The sum and difference of vectors in $L$ belong to $L$.
(ii) The set $L$ has no point of accumulation in $\mathbb{R}^n$.
(iii) The vectors in $L$ span $\mathbb{R}^n$.


Here are some examples. 
Example 22. The set $E$ of all vectors with integer components forms a lattice.


Example 23. The image of any lattice under an invertible linear map $\mathbb{R}^n \to \mathbb{R}^n$ is a lattice.


Lemma 23. 


(i) Every lattice $L$ can be represented as

$$L = AE, \tag{51}$$

where $A$ is an invertible linear map $\mathbb{R}^n \to \mathbb{R}^n$, and $E$ is the integer lattice described in example 22.

(ii) The representation of $L$ in form (51) is not unique. But in all such representations $|\det A|$ has the same value.


Proof. For a proof of part (i) we refer the reader to appendix 5 of a splendid text on linear algebra by the author of these pages.

(ii) Let $L = A_1 E$ and $L = A_2 E$ be two representations of $L$ of form (51). It follows that $A_2^{-1} A_1$ maps $E$ onto $E$. We want to show now that a linear map $M$ that maps $E$ onto itself has integer entries and determinant $\pm 1$. Clearly, if $M$ had a noninteger entry $m_{ij}$, then the $i$th component of $M e_j$, where $e_j$ is the $j$th unit vector, is $m_{ij}$, contrary to the assumption that $M$ maps every vector in $E$ into $E$.

Since $M^{-1}$ also maps $E$ onto $E$, its entries, too, are integers. Since $M M^{-1} = I$, $(\det M)(\det M^{-1}) = 1$. Since the entries of $M$ and $M^{-1}$ are integers, their determinants are integers; therefore $\det M$ can have no other value than $\pm 1$. Applying this to $M = A_2^{-1} A_1$, we conclude that $\det M = (\det A_2)^{-1} (\det A_1) = \pm 1$. Hence $|\det A_1| = |\det A_2|$. □
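Part (ii) of the lemma can be checked on a small example. The following is a minimal sketch in Python, assuming a hypothetical planar lattice with basis matrix `A1` and a second basis matrix `A2 = A1 U`, where `U` is an integer matrix of determinant 1 chosen only for illustration. It uses exact rational arithmetic and confirms that $M = A_2^{-1} A_1$ has integer entries and determinant $\pm 1$, and that $|\det A_1| = |\det A_2|$.

```python
from fractions import Fraction


def frac_matrix(rows):
    """Convert a nested list of integers into a matrix of Fractions."""
    return [[Fraction(x) for x in row] for row in rows]


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def matmul2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inv2(m):
    """Exact inverse of an invertible 2x2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]


def is_integer_matrix(m):
    """True when every entry is an integer."""
    return all(x.denominator == 1 for row in m for x in row)


# Hypothetical example data: A1 is one basis matrix of a lattice L = A1 E,
# U is an integer matrix with determinant 1, so U E = E and A2 E = A1 E.
A1 = frac_matrix([[2, 1], [0, 3]])
U = frac_matrix([[1, 1], [1, 2]])
A2 = matmul2(A1, U)

# M maps E onto E, so the lemma predicts integer entries and det M = +1 or -1.
M = matmul2(inv2(A2), A1)

print("A2 =", A2)
print("M  =", M)
print("|det A1| =", abs(det2(A1)), " |det A2| =", abs(det2(A2)))
```

The choice of a 2x2 example keeps the inverse explicit; nothing in the argument depends on the dimension.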


Definition. The dual $L'$ of a lattice $L$ consists of all vectors $b$ such that $a \cdot b$ is an integer for all $a$ in $L$.
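To make the definition concrete, here is a minimal sketch in Python, assuming a hypothetical planar lattice $L = AE$ with an illustrative basis matrix `A`. It takes the columns of the transpose of $A^{-1}$ as a candidate basis for the dual and checks, in exact arithmetic, that every pairing $a \cdot b$ of sampled lattice vectors is an integer. The function names are illustrative only.

```python
from fractions import Fraction
from itertools import product


def dual_basis(A):
    """Return B = transpose of the inverse of the 2x2 matrix A."""
    (a, b), (c, d) = A
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    return [[inv[0][0], inv[1][0]], [inv[0][1], inv[1][1]]]


def apply(m, v):
    """Apply a 2x2 matrix to a vector with two components."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def dot(x, y):
    """Euclidean scalar product in the plane."""
    return x[0] * y[0] + x[1] * y[1]


# Hypothetical basis matrix; its columns (2, 0) and (1, 3) generate L.
A = [[Fraction(2), Fraction(1)], [Fraction(0), Fraction(3)]]
B = dual_basis(A)

# Pair a sample of lattice vectors A m with candidate dual vectors B k.
coeffs = list(product(range(-2, 3), repeat=2))
pairings = [dot(apply(A, m), apply(B, k)) for m in coeffs for k in coeffs]
all_integers = all(p.denominator == 1 for p in pairings)

# A vector outside the dual: its product with the lattice vector (1, 3) is 1/2.
outside = dot(apply(A, (0, 1)), [Fraction(1, 2), Fraction(0)])

print("B =", B)
print("all sampled pairings are integers:", all_integers)
print("pairing with a vector outside the dual:", outside)
```

The check works because $(Am) \cdot (Bk) = m \cdot k$ when $B$ is the transpose of $A^{-1}$; this illustrates the definition on samples and is not a substitute for the proofs requested in the exercises.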


Exercise 24.


(a) Show that the dual of a lattice is a lattice.
(b) Show that $L'' = L$.
(c) Show that for every lattice $L$ and every invertible linear map $A$ of $\mathbb{R}^n \to \mathbb{R}^n$, $(AL)' = BL'$, where $B = (A^{-1})'$ is the transpose of $A^{-1}$.


Let $L$ be a lattice; denote by $p_L$ the tempered distribution

$$p_L = \sum_{a \in L} \delta(x - a). \tag{52}$$


Theorem 24. The Fourier transform of $p_L$ is

$$\hat p_L = c(L)\, p_{2\pi L'}, \tag{53}$$

where $c(L) = (2\pi)^{n/2}/|\det A|$, and $A$ is a matrix that appears in the representation of $L$ of form (51).


Exercise 25. Prove theorem 24 by imitating the steps that went into the proof of 
theorem 22. 


Exercise 26. Formulate Poisson’s summation formula for lattices in $\mathbb{R}^n$.


B.6 APPLICATIONS OF THE FOURIER TRANSFORM 


The Fourier transform is used in a large part of mathematics to solve, or at least 
reformulate, problems. Here are a few examples. 


Liouville’s Theorem. Let $f(z)$ be an analytic function defined in the whole complex plane that is of polynomial growth:

$$|f(z)| \le \text{const.}\,(1 + |z|)^M.$$

Then $f$ is a polynomial of degree $\le M$.

Proof. An analytic function satisfies the Cauchy-Riemann equation

$$\partial_{\bar z} f = \tfrac{1}{2}(\partial_x + i\,\partial_y) f = 0. \tag{54}$$


A function $f$ of polynomial growth is a tempered distribution; so are its derivatives. The Fourier transform of (54) is

$$(\xi + i\eta)\, \hat f(\xi, \eta) = 0.$$

It follows from exercise 6 that the support of $\hat f$ is the origin. By theorem 5, such a distribution $\hat f$ is of the form

$$\hat f = \sum_{|\alpha| \le N} c_\alpha D^\alpha \delta. \tag{55}$$

According to theorem 18, the Fourier inverse of $\delta$ is constant, so, according to theorem 17', the Fourier inverse $f$ of the right side of (55) is a polynomial in $x$ and $y$. Since $f$ is analytic, it is a polynomial in $z = x + iy$. □


Unlike the usual proofs of Liouville’s theorem, this one uses nothing of the theory 
of analytic functions beyond the Cauchy-Riemann equations. In fact the same proof 
gives the following much more general result: 


Theorem 25. Let $P(\xi_1, \ldots, \xi_n) = P(\xi)$ be a homogeneous elliptic polynomial, that is, one whose only real zero is $\xi = 0$. Let $f$ be a tempered distribution that is a solution in all $\mathbb{R}^n$ of the partial differential equation

$$P(D_1, \ldots, D_n) f = 0.$$


Then f is a polynomial. 


Examples of such partial differential equations are $\Delta f = 0$, $\Delta^2 f = 0$, and many others.

We give some applications of the Poisson summation formula. Set $u(x) = e^{-x^2/2}$ into formula (50). Since $\hat u(\xi) = e^{-\xi^2/2}$ we get that for all real $a \neq 0$,

$$\sum_m e^{-a^2 m^2/2} = \frac{\sqrt{2\pi}}{|a|} \sum_n e^{-b^2 n^2/2}, \qquad b = \frac{2\pi}{a}.$$

Denoting the function on the left as $Z(a)$, we can rewrite this as

$$Z(a) = \frac{\sqrt{2\pi}}{|a|}\, Z\!\left(\frac{2\pi}{a}\right).$$
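The functional equation for $Z$ lends itself to a quick numerical sanity check. The following is a minimal sketch in Python, assuming that truncating the sum to $|m| \le 200$ is accurate enough for the sample values of $a$ used (the neglected terms underflow to zero in floating point). The names `Z` and `poisson_gap` are illustrative only.

```python
import math


def Z(a, terms=200):
    """Truncated sum of exp(-a^2 m^2 / 2) over the integers m with |m| <= terms."""
    return sum(math.exp(-(a * m) ** 2 / 2) for m in range(-terms, terms + 1))


def poisson_gap(a):
    """Left side minus right side of Z(a) = sqrt(2 pi)/|a| * Z(2 pi / a)."""
    return Z(a) - math.sqrt(2 * math.pi) / abs(a) * Z(2 * math.pi / a)


for a in (0.5, 1.0, 2.5, 4.0):
    print(f"a = {a:.2f}   Z(a) = {Z(a):.12f}   gap = {poisson_gap(a):.2e}")
```

For small $a$ the left side converges slowly while the right side is dominated by its $n = 0$ term, which is the practical use of the identity.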


Exercise 27. Set $u(x) = e^{-x^2/2 + x}$ into the Poisson summation formula, and see what you get.


B.7 FOURIER SERIES 


The Fourier analysis of periodic distributions is fairly straightforward. Let $u$ be a $C^\infty$ function on the unit circle $S^1$. Its Fourier coefficients $b_n$ are

$$b_n = \int_{S^1} e^{-in\theta}\, u(\theta)\, \frac{d\theta}{2\pi}.$$




Exercise 28. 


(a) Show that for any $N$ there is a constant such that $|b_n| \le \text{const.}\,|n|^{-N}$.

(b) Show that the partial sums of the Fourier series of $u$, $u_N = \sum_{|n| \le N} b_n e^{in\theta}$, converge to $u$ in the $C^\infty$ topology.

(c) Let $\ell$ be a distribution. Its Fourier coefficients $a_n$ are defined by

$$a_n = (e^{-in\theta}, \ell)/2\pi.$$

Show that for any $C^\infty$ function $u$,

$$(u, \ell) = 2\pi \sum_n a_{-n} b_n.$$

(d) Let $\{a_n\}$ be a sequence of complex numbers that satisfy

$$|a_n| \le \text{const.}\,|n|^N$$

for some $N$. Show that the $a_n$ are the Fourier coefficients of some distribution on $S^1$.


BIBLIOGRAPHY 


Bochner, S. Vorlesungen über Fouriersche Integrale. Akademische Verlagsgesellschaft, Leipzig, 1932. 
Dirac, P. A. M. The Principles of Quantum Mechanics. Clarendon Press, Oxford, 1930. 


Lax, P. D. On Cauchy’s problem for hyperbolic equations and the differentiability of solutions of elliptic 
equations. CPAM 8 (1955): 615-633. 


Lax, P. D. Linear algebra. Series on pure and applied mathematics. Wiley-Interscience, 1997. 
Lützen, J. The Prehistory of the Theory of Distributions. Springer Verlag, 1982. 


von Neumann, J. Mathematische Grundlagen der Quantenmechanik. Die Grundlehren der mathematischen Wissenschaften, 38. Springer, Berlin, 1932.


Schwartz, L. Théorie des distributions. Hermann, Paris, 1950-1951. 


Sobolev, S. L. Méthode nouvelle à résoudre le problème de Cauchy pour les équations linéaires hyperboliques. Mat. Sb., 1 (1936): 39-71.


Strichartz, R. Guide to Distribution and Fourier Transform. Studies in Advanced Mathematics. CRC 
Press, Boca Raton, 1994. 


Weyl, H. The method of orthogonal projection in potential theory. Duke Math. J., 7 (1940): 411-444.
Wiener, N. The operational calculus. Math. Ann., 95 (1926): 557-584.


ZORN’S LEMMA 


Zorn’s lemma is a theorem in the Zermelo-Fraenkel system of set theory. It is equiv- 
alent logically with the axiom of choice. Thus its use introduces a highly noncon- 
structive step; therein lies its power. 

Zorn’s lemma deals with partially ordered sets, nonempty sets where an order relation $a \le b$ is defined for some pairs of elements in the set, which satisfies


(i) transitivity: if $a \le b$ and $b \le c$, then $a \le c$.
(ii) reflexivity: $a \le a$ for all $a$ in the set.


A subset of a partially ordered set is called totally ordered if for every pair $x$, $y$ of elements in it, either $x \le y$ or $y \le x$.

An element $u$ of a partially ordered set is said to be an upper bound of a subset if $x \le u$ for every element $x$ in the subset.

An element $m$ of a partially ordered set is called maximal if no element $b$ of the set other than $m$ itself satisfies $m \le b$.


Zorn’s lemma. If every totally ordered subset of a partially ordered set has an upper 
bound, then the partially ordered set has a maximal element. 
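The notions entering the lemma can be illustrated on a finite example, where no appeal to the axiom of choice is needed. The following is a minimal sketch in Python, assuming the set $\{1, \ldots, 6\}$ ordered by divisibility as a hypothetical example, and reading "maximal" as: no other element of the set lies above it. It lists the maximal elements and shows that there need not be a single element lying above all others.

```python
def leq(a, b):
    """Order relation on positive integers: a <= b means that a divides b."""
    return b % a == 0


ELEMENTS = list(range(1, 7))


def is_totally_ordered(subset):
    """True when every pair of elements of the subset is comparable."""
    return all(leq(x, y) or leq(y, x) for x in subset for y in subset)


def upper_bounds(subset):
    """All elements u of the set with x <= u for every x in the subset."""
    return [u for u in ELEMENTS if all(leq(x, u) for x in subset)]


def maximal_elements():
    """Elements m such that no other element b satisfies m <= b."""
    return [m for m in ELEMENTS
            if not any(leq(m, b) and b != m for b in ELEMENTS)]


def greatest_elements():
    """Elements g with b <= g for every b in the set (there may be none)."""
    return [g for g in ELEMENTS if all(leq(b, g) for b in ELEMENTS)]


print("chain [1, 2, 4] totally ordered:", is_totally_ordered([1, 2, 4]))
print("upper bounds of [1, 2, 4]:", upper_bounds([1, 2, 4]))
print("[2, 3] totally ordered:", is_totally_ordered([2, 3]))
print("maximal elements:", maximal_elements())
print("greatest elements:", greatest_elements())
```

In this example every totally ordered subset has an upper bound, and there are three maximal elements but no element that dominates the whole set, which is why the lemma asserts only the existence of a maximal element.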




AUTHOR INDEX 


Adams, R. A., 51, 550 
Agnew, R. P., 24, 28 
Ahlfors, L. V., 267 
Akhiezer, N. I., 159, 414 
Alaoglu, L., 120, 121 
Aronszajn, N., 275, 278, 282 
Arzela, C., 243, 245 

Ascoli, G., 243, 245 

Atiyah, M., 312 

Atkinson, F. V., 305, 313 


Banach, S., 19, 28, 168, 172, 260, 527 

Beardon, A. F., 511

Berger, C., 312, 313 

Bernstein, A. R., 282 

Bernstein, S., 89, 138 

Beurling, A., 221, 225, 282, 513, 515, 516, 517, 
523, 525, 526 

Birkhoff, G., 151, 152, 156, 159, 251, 252 


Birman, M. Sh., 490, 511


Blaschke, W., 89, 522 

Bochner, S., 141, 144, 146, 149, 543 
Bohnenblust, H. F., 27, 28 

Bohr, H., 543 

Bohr, N., 459 

Boltzmann, L., 446 

Boutet de Monvel, L., 312, 313 

de Branges, L., 126, 132 


Brodsky, M. S., 280, 283


Buskes, G., 28 


Calderon, A. P., 470, 475 


Calkin, J. W., 234, 244 

Carathéodory, 125, 128, 141, 142, 143, 149, 152, 
159 

Carleman, T., 393, 470, 475 

Carleson, L., 215, 225 

Cartan, E., 172 


Enflo, P., 282, 283


Chernoff, P., 121, 429, 438 

Choquet, G., 128, 130, 131, 133, 135, 146, 150, 
151, 157 

Clarkson, J. A., 46, 51 

Coburn, L. A., 312, 313

Conway, J. B., 527 

Cook, J. M., 482, 511 

Courant, R., 41, 246, 252, 318, 319, 322, 328, 
345 


David, G., 191 

Davidson, K. R., 279, 283 
Day, M. M., 51, 527 

Diestel, J., 131, 132 
Dieudonné, J., 303, 305, 313 
Dirac, P. A. M., 108, 543, 566
Donoghue, W. F., 280, 283 
Doob, J. L., 378, 393 
Douglas, R. G., 311, 313, 527 
Dunford, N., 112, 121, 527
Dyson, F. J., 271, 273, 274 


Eberlein, W. F., 105, 107, 121 
Edwards, R. E., 527 


Faddeev, L., 492, 505, 510, 511 
de Finetti, B., 156, 159 
Fischer, E., 318, 319, 322, 328


Frayn, M., 459 

Fréchet, M., 57, 62, 75

Fredholm, I., 260, 268, 274, 279, 342 

Friedrichs, K. O., 115, 117, 402, 412, 414, 432, 
502 

Fröhlich, J., 491, 511

Garabedian, P. R., 71, 94, 98

Gelfand, I. M., 159, 195, 202, 208-210, 221, 222, 
224, 271, 273, 274, 371, 376, 436, 437 


Gilkey, P. B., 352, 353
Gohberg, I. C., 305, 311, 313, 341, 353
Goldberg, S., 341, 353
Goldstein, J. A., 416, 424, 429, 438
Gödel, K., 459
Grothendieck, A., 353
Guillemin, V., 312, 313

Hadamard, J., 178, 262, 267, 543
Hahn, H., 19, 28
Halmos, P. R., 185, 191, 515, 523, 526
Hamburger, H., 150, 159, 410, 411, 414
Hankel, H., 50, 188, 376
Hardy, G. H., 191, 224, 513
Hausdorff, F., 34, 35, 122, 123, 180, 191, 221
Heisenberg, W., 414, 450, 455, 459, 460, 494, 511
Hellinger, E., 367, 375-377, 393
Helly, E., 107
Herglotz, G., 116, 117, 142, 149, 150, 159, 378
Hilbert, D., 52, 62, 176, 180, 188, 201, 244, 246, 252, 260, 315, 322, 375, 376
Hilden, 275
Hille, E., 416, 424, 426, 432, 438, 527
Holmgren, E., 176
Hopf, E., 312, 443, 460
Hoyt, F., 492
Hörmander, L., 91, 191, 304, 305, 313

Ikawa, M., 475, 476

Jauch, J. M., 482
Jentsch, R., 259
Johnson, W. B., 527
Journé, J. L., 191

Kakutani, S., 82, 86, 127, 129, 134, 529, 553
Kaashoek, M. A., 341, 353
Kato, T., 414, 415, 429, 438, 477, 479, 490, 511
Kelley, J., 31, 35, 132
Kendall, D. G., 156, 159
Kiefer, 156
König, D., 152, 159
König, H., 353
Koosis, P., 215, 225
Koopman, B. O., 378, 393, 445, 460
Kramish, A., 459
Krein, M. G., 31, 122, 124, 147-148, 156, 159, 259, 311, 313, 341, 353
Krupnik, N. Ja., 311, 313, 353
Kuroda, S. T., 479, 511

Landau, H., 159, 414-415
Lawler, G. F., 258-259
Lax, P. D., 57, 62, 68, 71, 98, 253, 259, 273, 274, 313, 353, 427, 428, 447, 450, 460, 465, 467, 470, 475, 476, 493, 500, 511, 523, 526
Lay, D. C., 527
Lengyel, B., 393
Levi, B., 62
Levitan, B. M., 271, 273-274
Lezanski, T., 353
Lidskii, V. B., 329, 334, 336, 341, 353
Lindelöf, E., 284
Lindenstrauss, J., 527
Littlewood, J. E., 191, 224
Lomonosov, V. I., 275, 282
Lorch, E. R., 393
Lumer, G., 432, 434, 438

Marchenko, V. A., 271, 273, 274
Mazur, S., 49, 51, 103, 107, 206
Mazya, V., 51
McCoy, B. M., 307, 314
Melrose, R., 475, 476
Mercer, T., 343, 353
Mikusinski, J., 525, 526
Milgram, A., 57, 62, 68
Milman, D. P., 78, 86, 122, 124, 156
Morawetz, C. S., 475, 476
Morrison, T. J., 527
Morse, A. P., 24, 28
Morton, W., 446
Møller, C., 492, 512
Müller, C., 470, 476
Müntz, Ch. H., 88, 98, 286

Namioka, I., 31, 35, 132
Nehari, Z., 313
von Neumann, J., 53, 62, 63, 71, 275, 377, 378, 389, 393, 400, 414, 443, 450, 457, 460, 479, 512, 543
Nevanlinna, R., 379, 393
Nikodym, O. M., 63, 71
Noether, F., 301, 305, 314

Osher, S., 307, 314
Oxtoby, J., 157-159

Paley, R. E. A. C., 251, 252, 494, 523
Parter, S. V., 307, 314
Pavlov, B., 500, 505, 510, 511
Perron, O., 253
Phelps, R. R., 130, 132
Phillips, R. S., 416, 424, 432, 434, 435, 437, 438, 450, 460, 465, 470, 475, 476, 493, 500, 509, 510, 513, 527
Phragmén, E., 284


Pietsch, A., 353 
Poincaré, H., 246, 252, 500 
Pólya, G., 191 


Radjavi, H., 283 

Radon, J., 63, 71

Ralston, J., 475, 476 

Rayleigh, Lord, 317, 447 

Reed, M., 414, 415, 492, 512, 527 

Rellich, F., 246, 252, 327, 406, 415, 432, 477, 
478, 504, 505, 512 

Retherford, J. R., 353 

Richtmyer, R., 428 

Riesz, F., 43, 57, 62, 82, 86, 116, 117, 127, 129, 
134, 149, 150, 159, 234, 238, 244, 260, 378, 
393, 527, 530, 553 

Riesz, M., 177, 183, 189, 191 

Ringrose, J. R., 277, 278, 283 

Robinson, A., 282 

Rogosinski, W. W., 94, 98 

Rosenblum, M., 479, 512 

Rosenthal, P., 283 

Rota, G. C., 251, 252 

Royden, H., 245 

Rudin, W., 146, 159, 527 

Rüst, B., 459 

Rutman, M. A., 259 

Ryll-Nardzewski, 525 


Sarason, D., 18, 314 

Sarnak, P., 509, 512 

Schauder, J., 172, 243, 244, 335 
Schiffer, M., 71, 94, 98 
Schmidt, E., 176, 260 

Schmidt, P. K., 459 
Schrödinger, E., 402, 477 
Schur, I., 48, 101, 191

Schwarz, H. A., 52 


Schwartz, J. T., 121, 341, 446, 460, 527


Schwartz, L., 145-147, 159, 286, 299, 543 
Selberg, A., 349 
Shapiro, H. S., 94, 98 


Shiffman, M., 98

Shohat, J. A., 159, 414, 415


Sikorski, R., 353 

Simon, B., 414, 415, 492, 512 
Sinai, Ja. G., 450, 458, 460 
Singer, I., 312 

Smith, K. T., 275, 278, 282 




Smulyan, V. I., 121 

Sobczyk, A., 27, 28

Sobolev, 40, 42 

Sokal, A. D., 258, 259 
Soukhomlinoff, G. A., 27, 28 
Spencer, T., 491, 511 

Stieltjes, T., 411, 414, 415 

Stone, M., 126, 376, 378, 415, 440, 460 
Strang, G., 312, 314, 429, 438 
Strichartz, R., 543 

Sunder, V. S., 185, 191 

Szász, O., 90, 98

Sz. Nagy, B., 251, 252, 376, 393, 527 


Tamarkin, J. D., 159, 414, 415 
Tauber, A., 224 

Taylor, A. E., 196, 201, 527 
Taylor, M., 191, 475, 476 
Thorin, G. O., 178, 191 
Titchmarsh, 280, 523, 525, 526 
Toeplitz, O., 108, 117, 142, 305, 377, 393 
Trotter, H. F., 429, 433, 434, 439 
Turing, A., 526

Tychonoff, A., 121, 155, 208 
Tzafriri, L., 527 


Uhl, J. J., 131, 132 
Ulam, S., 49, 51


Volterra, V., 231, 232 


Weyl, H., 69, 71, 191, 326, 328, 341, 457, 
458, 460, 479, 512, 555 

von Weizsäcker, C., 459

von Weizsäcker, E., 459

Wheeler, J., 492 

Wielandt, H., 457, 460 

Wiener, N., 201, 213, 215, 218, 225, 251, 
252, 312, 494, 523 

Wintner, A., 376 

Wolff, T., 215 

Wolpert, S. A., 509, 512 


Yood, B., 302, 314


Yosida, K., 416, 424-426, 432, 439, 449, 527
Young, W. H., 180, 191 


Zaremba, S., 65, 71 
Zeidler, E., 527 




SUBJECT INDEX 


Absolute value, 330 
Almost orthogonal bases, 251-252 


Analytic function


entire, 265 

positive real part, 115 

resolvent, 195, 382 

strongly, 111 

weakly, 111 
Anderson localization, 491 
Annihilator, 76, 244 
Approximation 

by powers, 89-90 

by weighted polynomials, 88 

of the δ function, 108
Arzela—Ascoli theorem, 243 


B* algebra, 222 
Baire category principle, 169 
Banach algebra, 192, 435 
Banach limits, 31 
Banach space, 38 
reflexive, 78, 82 
uniformly convex, 45 
Bessel’s inequality, 60 
Blaschke product, 89, 522 
Borel subsets, 354, 361 
Bounded linear maps, see Linear maps 


Capacity, 239


Cauchy transform, 381 
Cayley transform, 389, 400 
Closed graph theorem, 170-171, 377 
Compact, 43 

map, 233 

noncompact, 43 
Completely monotone functions, 137 
Convexity, 5 

closest point, 45, 54 

convex combination, 5 


convex function, 135, 337 

convex hull, 6 

extreme point, 6, 124 

extreme subset, 6, 124 

Riesz theorem, 177 

subset, 45 

uniform, 45 
Convolution operator, 216, 323, 348 
Cross norm, 479 


Deficiency index, 400 
Discrete subgroup, 501 
automorphic, 501 
fundamental group, 501 
fundamental polygon, 501 
Determinant, 341-342. See also Fredholm integral equation
Dimension, codimension, 4 
Dirichlet’s problem, 65, 94, 112, 327, 405, 462 
Distribution, 113, 543
delta function, 108, 544 
Fourier transform, 558-559 
support, 547—548 
tempered, 559 
Dual, 72 
of C(Q), 82 
of $L^p$, 79
variational problems, 76-77, 86 


Eigenvalue, 229, 238, 248, 253, 266, 
287 

algebraic multiplicity, 267, 278 
Eigenvector, 229, 248 
Elliptic PDE, 249, 258, 286, 461—462 
Energy, 447, 454 

decay, 473 
Equivalence theorem, 427 
Ergodic mapping, 157-158, 443
Extreme point, see Convexity 




Fourier transform, 180, 231, 559 
Fredholm integral equation, 260 
alternative, 260 
determinant, 260, 263 
resolvent, 260 
Friedrichs extension, 402-406, 412 
Functional calculus, 197, 200, 320 


Galerkin’s method, 115
Gelfand 
compactification, 210, 212 
representation, 208
topology, 208
Geometrical optics, 475 
Gram-Schmidt process, 61, 335 


Hahn—Banach theorem, 19, 29 
complex, 27 
extension, 24 
geometric version, 21 
Hamiltonian flow, 440 
Hankel 
matrix, 313 
operator, 312 
positive, 50, 410 
Hardy space, 305, 513 
Heisenberg 
commutation relation, 455, 456 
politics, 459 
uncertainty principle, 456, 457 
Helly’s theorem, 107, 116 
Hilbert 
space, 48, 52 
transform, 181 
Hilbert-Hankel operator, 188 
Hilbert-Schmidt operator, 176, 341, 352, 479
Hille-Yosida theorem, 424, 440, 462
Hölder’s inequality, 41
Huygens’ principle, 454 
generalized, 475 
Hyperbolic equations, 186 
Hyperbolic plane, 500 
geodesics, 501 
metric, 500 
motion, 501 
wave equation, 505 


Ideal, 203 
maximal ideal, 205, 208 
principal ideal, 204, 518 
two sided, 234 
Incoming 
data, 454 




solution, 454, 473 
subspace, 470, 473 
Index, 15-18, 275, 300, 349-352
Infinitesimal generator, 421 
Integral operators, 173, 180, 246 
Fredholm, 260 
Volterra, 231 
Interior compactness, 286
Interior point, 21, 123


Invariant subspace, 10, 12, 275-277 
nested, 278-279 

Isometry, 47, 329 

Isomorphism, 4


Kernel, of integral operator, 173, 260 


Laplace—Beltrami operator, 502 
Laplace operator, 66, 68-69, 94, 249, 259, 447 
Laplace transform, 183, 421 
Lattice, 275, 567-568 
Lax—Milgram lemma, 57 
Lax—Phillips 

scattering matrix, 495 

scattering operator, 493 

scattering theory, 493-499 

semigroup, 470-472, 499 
Lebesgue decomposition theorem, 364 
Linear functionals, 56 

bounded, 72 

continuous, 72 
extension of, 19, 74

positive, 29, 133 
Linear map, 4, 8, 49 

algebra of, 8, 168 

bounded, 160, 173 

compact, 233 

continuous, 160 

degenerate, 12, 234

index, 15, 236 

pseudoinverse, 12 

self-adjoint, 160 

symmetric, 185 
Linear space, 1 

dual, 72 

isomorphism, 4 

normed, 36 

quotient, 4 

reflexive, 78 

separable, 42 

span, 3, 58 

subspace, 3 

sum of, 3 
Liouville’s theorem, 564 
Locally convex topological linear space, 122 




Mapping, see Operator 
Measure, 82, 361 
absolutely continuous, 63, 365 
projection valued, 354 
Metrically transitive, 446 
Modular group, 501 
Moment problem 
Hamburger, 150, 410-411 
Stieltjes, 411-412 
Multiplicative functional, 202 


Neumann boundary condition, 250, 328 
von Neumann algebras, 312 
Norm, 36, 52 

strictly subadditive, 44 
Normal operator, 321, 370-372 
Normed algebra, 192 


Open mapping principle, 168, 170, 172 
Operator 
bounded, 160, 354 
dissipative, 433 
normal, 354, 370 
symmetric, 354 
unitary, 354, 372, 389 
Orthogonal, 53 
complement, 55 
orthonormal, 59 
Outgoing 
data, 474 
solution, 473 
subspace, 471, 474 


Paley—Wiener theorem, 494, 524, 525 
Parabolic equation, 190, 250 
heat equation, 188, 250, 461-462 
Parallelogram identity, 53 
Parseval relation, 306, 463 
Phragmén-Lindelöf principle, 284
abstract, 286 
Poisson summation formula, 349, 566 
Polar decomposition, 329, 330, 362-363 
Positive definite 
cos eyel y, 147s oo nes oe Bei 


functions, 144 

sequences, 142 
Precompact, 233
Principle of uniform boundedness, 10 
Projection, 172, 241 


Quantum mechanics, 414 


Radon-Nikodym theorem, 63, 365, 366 
Rayleigh quotient, 317 




Rellich 
compactness criterion, 246, 249, 251, 327 
perturbation theorem, 406—410 
Resolvent, 195, 238, 241, 382 
identity, 198 
set, 194, 419 
Riemann hypothesis, 511 
Riemann ζ function, 510
Riesz-Fréchet representation theorem, 57, 66, 355
Runge’s theorem, 91


Scalar product, 52 
nondegenerate, 59 
Scattering theory, 273, 477 
inverse problem, 273 
operator, 491 
potential, 490—491 
reflection coefficient, 273 
Schrödinger operator, 402, 414, 440, 492 
Schwarz 
inequality, 52 
reflection principle, 500 
Selberg trace formula, 349 
Self-adjoint operators, 377-378, 396 
essentially self-adjoint, 398 
perturbation, 406 
Semigroup, 416 
approximation, 427 
generation, 424, 440 
spectral theory, 434 
strongly continuous, 418 
transpose, 438 
uniformly continuous, 417 
weakly continuous, 424 
Sesquilinear, 52 
Singular value, 330 
Sobolev 
inequality, 42, 550 
norm, 40, 298 
space, 40 
Spectral mapping theorem, 200 
Spectral multiplicity, 366-370 
Spectral radius, 195, 277 


Gelfand’s formula, 195


Spectral representation 


symmetric operators, 364-370
Spectral resolution

compact maps, 238-242 

compact normal operators, 321-322 

compact symmetric operators, 316-320 

normal operators, 371-372 

self-adjoint operators, 377-389, 390 

symmetric operators, 358-364 

unitary operators, 372-375 


Spectrum, 194, 238, 241, 253, 256
  absolutely continuous, 364
  compact operator, 238, 316
  point spectrum, 238-242, 486
  symmetric operator, 356-357
  unbounded operator, 419
Stochastic integral operator, 256
Stone-Weierstrass theorem, 126
Strong
  analyticity, 111
  convergence, 99
  convergence of sequences of maps, 165
  solution, 115
  topology of linear maps, 165
Support function, 83

Tauberian theorem, 220
Three lines theorem, 178
Titchmarsh convolution theorem, 280, 523-525
Toeplitz operator, 306-312
  matrix, 307
  symbol, 306
Trace, 267, 305, 329-330, 333-334
  class, 329, 330-334
  formula, 267, 329, 334-341
  norm, 330, 479
Translation representation, 448-450
Transpose, 163

Unitary operator, 322

Wave equation, 187, 447, 454
  automorphic, 505
  hyperbolic, 505
  incoming data, 454
Wave operator, 480-489
  invariance, 490
Weak
  analyticity, 111
  compactness, 121
  convergence, 99
  convergence of sequences of maps, 165
  sequential closure, 118
  sequential compactness, 104
  solution, 115
  topology, 118
  topology of linear map, 165
Weak*
  compactness, 120
  convergence, 105
  sequential compactness, 106
  topology, 118
Wiener-Hopf operators, 312
Winding number, 307

Zaremba’s inequality, 65


